Why are the Weibull parameters not unique for a given data set?

#1
I have data consisting of the intervals, in days, between customers' purchases.
I am trying to estimate the shape and scale parameters with scipy.stats.weibull_min.

However, the parameters returned by the fit function are not unique, and when I try to constrain the scale parameter to 1, it does not work.

Here are the three results from different inputs:
shape, loc, scale = scipy.stats.weibull_min.fit(data, floc=1, scale=1)
# constrain scale to be 1 (yellow curve)
loc:1 shape:0.7318249351 scale:75.22852953

shape, loc, scale = scipy.stats.weibull_min.fit(data, floc=1, f0=1)
# constrain shape to be 1 (blue curve)
loc:1 shape:1 scale:90.85

shape, loc, scale = scipy.stats.weibull_min.fit(data, floc=1)
# no constraint (green curve)
loc:1 shape:0.7 scale:127.26

Also, which curve best fits the original distribution?
 

Miner

TS Contributor
#2
Why are you setting the location parameter = 1? Your histogram appears to show a location parameter = 0, which simplifies to a 2-parameter Weibull. Setting a location parameter =1 is stating that it is impossible for the interval to be < 1.
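To make the point about the location parameter concrete, here is a minimal sketch. The shape and scale values are illustrative assumptions, not fitted to the poster's data. It only shows that loc=1 assigns zero probability to intervals at or below 1, while loc=0 does not.

```python
from scipy.stats import weibull_min

# Illustrative parameter values only; they are not fitted to the poster's data.
shape, scale = 0.7, 100.0

# With loc=1 the support starts at 1, so P(X <= 1) is exactly 0.
p_below_one_shifted = weibull_min.cdf(1.0, shape, loc=1, scale=scale)

# With loc=0 (two-parameter Weibull) intervals up to 1 have positive probability.
p_below_one_unshifted = weibull_min.cdf(1.0, shape, loc=0, scale=scale)

print(p_below_one_shifted, p_below_one_unshifted)
```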
 
#3
Why are you setting the location parameter = 1? Your histogram appears to show a location parameter = 0, which simplifies to a 2-parameter Weibull. Setting a location parameter =1 is stating that it is impossible for the interval to be < 1.
My data's minimum is 1, so the histogram starts at 1. And if I don't specify loc as 1, the fitted loc parameter also comes out as 1.
 

Miner

TS Contributor
#4
In general, you should not constrain any parameters unless you have historical data or theoretical knowledge of what a particular parameter is or should be. Visually, the green curve (no constraints) appears to be the best fit of the three shown. This provides an interesting insight into the data set. A shape parameter less than 1 (here, 0.7) indicates a decreasing hazard rate: the longer a customer has gone without purchasing, the less likely a purchase becomes in the next instant, which is not a good thing in sales.
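The link between the shape parameter and the event rate can be checked numerically. The sketch below assumes loc=0 and scale=1, and the time points are arbitrary. It computes the Weibull hazard h(t) = pdf(t) / sf(t) for a shape below, equal to, and above 1.

```python
import numpy as np
from scipy.stats import weibull_min

def hazard(t, shape, scale=1.0):
    """Weibull hazard h(t) = pdf(t) / survival(t), with loc fixed at 0."""
    pdf = weibull_min.pdf(t, shape, scale=scale)
    sf = weibull_min.sf(t, shape, scale=scale)
    return pdf / sf

t = np.array([0.5, 1.0, 2.0, 4.0])  # arbitrary time points for illustration

h_decreasing = hazard(t, 0.7)  # shape < 1: hazard falls over time
h_constant = hazard(t, 1.0)    # shape = 1: exponential, constant hazard
h_increasing = hazard(t, 1.5)  # shape > 1: hazard rises over time

print(h_decreasing, h_constant, h_increasing)
```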
 
#5
Hi,
the correct way to fix the scale parameter when using scipy.stats distributions is fscale=const. If you write scale=const, you are only specifying an initial value for the optimization algorithm that searches for the parameter values maximizing the likelihood (the .fit(data) method is an MLE).
I agree with Miner that you should leave all three parameters free and accept the estimates the procedure provides. Fixing some parameters makes sense if you lack data (e.g. you have only 5 values) and have some prior information about the parameter you are fixing.
Your scale (the 63.2nd percentile) looks to be around 150, so fixing it at 1 does not make much sense.
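A rough sketch of the difference between scale= and fscale=, using synthetic data as a stand-in because the original purchase intervals are not available. The generating parameters, sample size, and the loglik helper are assumptions for illustration. Comparing the total log-likelihood of each fit is also one way to judge which curve fits better, rather than relying on the plot alone.

```python
import numpy as np
from scipy.stats import weibull_min

# Synthetic stand-in for the purchase intervals (assumed shape 0.7, scale 100).
rng = np.random.default_rng(0)
data = weibull_min.rvs(0.7, loc=0, scale=100.0, size=2000, random_state=rng)

# scale=1 is only a starting guess: the optimizer is free to move away from it.
c_guess, loc_guess, scale_guess = weibull_min.fit(data, floc=0, scale=1)

# fscale=1 holds the scale at exactly 1 and estimates only the shape.
c_fixed, loc_fixed, scale_fixed = weibull_min.fit(data, floc=0, fscale=1)

def loglik(c, loc, scale):
    """Total log-likelihood of the data under the given parameters."""
    return np.sum(weibull_min.logpdf(data, c, loc=loc, scale=scale))

ll_guess = loglik(c_guess, loc_guess, scale_guess)
ll_fixed = loglik(c_fixed, loc_fixed, scale_fixed)

print("scale as initial guess:", c_guess, scale_guess, ll_guess)
print("scale fixed at 1:      ", c_fixed, scale_fixed, ll_fixed)
```

A higher log-likelihood means a better fit in the MLE sense. Because the optimizer is iterative, different starting values or constraints can end at different parameter estimates, which is one reason the three fits in the first post do not agree.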

Besides, I don't think the Weibull distribution is the proper way to model purchases. Weibull is an extreme value distribution. You should look for a more appropriate distribution for your random process.
 

rogojel

TS Contributor
#6
Besides, I don't think the Weibull distribution is the proper way to model purchases. Weibull is an extreme value distribution. You should look for a more appropriate distribution for your random process.
I have always seen Weibull as a very flexible distribution, surely not ONLY for extreme values: see its uses in reliability theory, for example.