The OP could have said that he had already posted the same text at CrossValidated.
So I suggest that we wait with answers....
I am involved in a project which needs statistics and computer science. I am more of a computer guy, so I appreciate any statistical help. I am doing most of the process in R.
Step A:
In the project, I have some sensors which count the number of passing objects in minutes. These sensors only detect and count about 5% of the passing objects, namely those which have a detectable device (let's call this the actual detectability ratio). Since my interest is to find the best distribution for each hour, I fitted distributions to the collected data and found out which ones were the best (Weibull for some hours and Lognormal for the rest). This means that I got 24 sets of results (one distribution for each hour). Now I would like to run a Monte Carlo simulation and evaluate my approach for different detectability ratios (c in 0.05, 0.1, 0.15, ..., 1). This is the part where I need help, and any help is highly appreciated.
Step B:
Assume for a single hour, Weibull(lambda, k) was the best and I got this information from 'Step A'.
So, I multiplied lambda by c in the Weibull distribution and generated some random numbers (P1) as P1 = Weibull(c*lambda, k) with 0 < P1 < 60. Then I added hour * 60 to each element of P1 (let's call this new data set P2) so that 60*hour < P2 < 60*(hour+1). Then, based on c, I used a Bernoulli distribution, Bernoulli(length(P2), c), and eliminated the elements that drew 0 to simulate the actual detectability ratio. For the hours where Lognormal was the best, I used lnorm(mean+ln(c), sd).
I repeated the whole above process (step B) for all values in c (0.05,...,1).
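To make Step B concrete, here is a minimal sketch of one pass for a single Weibull hour, repeated over the grid of c values. It is written in Python with the standard library only; in R the corresponding calls would be rweibull, rlnorm and rbinom. Several things are assumptions and not taken from the post: the parameter values (lambda = 30, k = 1.5, hour = 8, n = 200 draws), the function name simulate_hour, and the reading of "0 < P1 < 60" as discarding draws outside that range.

```python
import random

def simulate_hour(lam, k, c, hour, n, rng):
    """One pass of Step B for a Weibull hour (hypothetical helper)."""
    # Draw n values from Weibull(scale=c*lam, shape=k).
    draws = [rng.weibullvariate(c * lam, k) for _ in range(n)]
    # Assumption: "0 < P1 < 60" means draws outside the hour are dropped.
    p1 = [x for x in draws if 0 < x < 60]
    # Shift the minutes into this hour's window.
    p2 = [x + 60 * hour for x in p1]
    # Bernoulli thinning: keep each point with probability c.
    kept = [x for x in p2 if rng.random() < c]
    return p2, kept

rng = random.Random(1)  # fixed seed so the run is repeatable

# Grid of detectability ratios: 0.05, 0.10, ..., 1.00
c_values = [round(0.05 * i, 2) for i in range(1, 21)]

# Number of points left to fit for each c (illustrative parameters only).
counts = {}
for c in c_values:
    p2, kept = simulate_hour(30.0, 1.5, c, 8, 200, rng)
    counts[c] = len(kept)

for c in c_values:
    print(f"c={c:.2f}  points left to fit: {counts[c]}")
```

Printing the number of surviving points per c is a quick way to see for which values there is too little data left to fit anything.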
Problem:
Then I fitted the data which I obtained from 'Step B' to different distributions to find out which ones are the best (i.e. repeating 'Step A'). The problem is that, for most of the values of c, I never got the same number of results as I got from 'Step A'. I expected to get 24 sets of results for each c (0.05, 0.1, ..., 1), but I only got 24 sets of results for c greater than 0.7 or 0.8. This means most of my results for c < 0.7 were empty sets, to which I could not fit any distribution.
I suspect that the problem is related to either of the following:
1. Since the distributions which I got in 'Step A' belong to the 5% detectability ratio (the actual detectability ratio; I know that only 5% of passing objects had detectable devices), when I multiply them by c in 'Step B' to generate random numbers from the Weibull (i.e. Weibull(c*lambda, k)), I am actually making them smaller. For example, for c=0.05, this will be 5%*0.05=0.0025 (the first 5% is the actual detectability ratio and the second 0.05 is c).
2. Do I need to use the Bernoulli distribution to equip the random numbers with a detectable device when I am already multiplying by c in the distribution? If yes, doesn't this make them even smaller? For example, for c=0.05, this will be 5%*0.05*0.05=0.000125 (the first 5% is the actual detectability ratio, the second 0.05 is the one that I used for generating the random numbers, and the third 0.05 is the one that I used for the Bernoulli).
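The arithmetic in the two points above can be tabulated for a few values of c. This is only a sketch of the products as I described them, not a claim that this is the right way to combine the ratios; the function name compound_ratio is made up for illustration.

```python
def compound_ratio(actual, c, with_bernoulli):
    """Product of the ratios from points 1 and 2 (hypothetical helper)."""
    ratio = actual * c      # point 1: actual ratio times c
    if with_bernoulli:
        ratio *= c          # point 2: a further factor c from the Bernoulli step
    return ratio

actual = 0.05  # the actual detectability ratio from the post
for c in (0.05, 0.5, 1.0):
    scaled = compound_ratio(actual, c, False)
    thinned = compound_ratio(actual, c, True)
    print(f"c={c:.2f}  scaled only: {scaled:.6f}  scaled and thinned: {thinned:.6f}")
```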
Any help is highly appreciated.
Many thanks,
Mohsen
Hi Operator,
I have asked my question in another thread in a different way. Is there a possibility that either you or I could delete this post?
Many thanks,
Mohsen