Chi Square goodness of fit test for uniformity for simulation results

#1
Hello,
As per my username, I am more of a logistics person than a statistics person; I know statistics well at an introductory level but not so much beyond that (although I do find it fascinating).

Here is my question: I am using Monte Carlo simulation to analyze an inventory model (the details of the inventory model aren't required for my question). My null hypothesis is that the time of day at which the reorder point is reached (call this t) is uniformly distributed on the interval (0,1), where 0 is the beginning of the day and 1 is the end of the day. I simulated 100,000 values of t (using a simulation model that generates demand from a distribution), ran a chi-square goodness-of-fit test for the uniform distribution, and (usually) did not reject the null hypothesis. My issue is that the test statistic varies more between simulation runs than I thought it should. I have tried increasing n (e.g., generating 1,000,000 values of t) and increasing k (the number of classes), and it is still variable. For example, I ran the simulation 10 separate times (each run generates n=100,000 values of t, which are put into k=20 classes to test for uniformity), and the chi-square test statistics for the ten runs were 18.01, 16.88, 22.82, 18.48, 21.8, 19.67, 30.28, 10.77, 23.41 and 15.02. The critical value (5% significance level, 19 degrees of freedom) is 30.14.

Does anyone have any advice about the nature of the chi-square statistic and/or my chosen methodology (including the values of n and k) that could cause this, and/or any suggestions for further analysis that could be done on the test statistic results to provide evidence that the null can be rejected?

Thank you.
 

spunky

Can't make spagetti
#2
I'm lowkey curious... why would you expect it to vary less? What kind of values would you find if you simulated data from a chi-square distribution at the same sample size and same degrees of freedom? You may be reproducing the DGP (Data Generating Process) with fidelity, even if it doesn't seem like it to you at first glance.
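A minimal sketch of the experiment suggested above, assuming Python with only the standard library. It draws batches of ten values straight from a chi-square distribution with 19 degrees of freedom and prints their mean and standard deviation next to those of the ten statistics reported in post #1. The helper name `chi_square_draws` and the seed are illustrative choices, not anything from the original simulation.

```python
import random
import statistics

DF = 19  # k - 1 degrees of freedom for k = 20 classes

# The ten test statistics reported in post #1.
observed = [18.01, 16.88, 22.82, 18.48, 21.8, 19.67, 30.28, 10.77, 23.41, 15.02]


def chi_square_draws(df, size, rng):
    """Draw `size` values from a chi-square distribution with `df` degrees of freedom.

    A chi-square(df) variable is a Gamma variable with shape df/2 and scale 2.
    """
    return [rng.gammavariate(df / 2, 2) for _ in range(size)]


rng = random.Random(12345)  # arbitrary seed, only for repeatability

obs_mean = statistics.mean(observed)
obs_sd = statistics.stdev(observed)
print(f"reported runs : mean={obs_mean:.2f} sd={obs_sd:.2f}")

# Five independent batches of ten pure chi-square(19) draws for comparison.
for batch in range(5):
    draws = chi_square_draws(DF, len(observed), rng)
    print(
        f"chi2({DF}) batch {batch + 1}: "
        f"mean={statistics.mean(draws):.2f} sd={statistics.stdev(draws):.2f} "
        f"min={min(draws):.2f} max={max(draws):.2f}"
    )
```

If the batches of pure chi-square draws look about as spread out as the reported runs, that suggests the variability is a property of the statistic itself and not a problem with the simulation.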
 
#4
Thank you spunky and katxt. My interpretation of your replies is that you are both saying something similar: the Chi sq test statistics for my ten samples do, in fact, follow the way that I should expect Chi sq values to be distributed (given the sample size of 10 and 19 degrees of freedom). I need to go for a walk with the dog and think about this a bit, but in the meantime here are some questions that came to mind:
spunky: what I expected (perhaps in my naivety and ignorance) was that my 100,000 values for t in an individual simulation replication would be so close to a uniform distribution that I would get a very low Chi sq value. I further assumed that running it for 100,000 order cycles (inventory-talk) would produce very stable results (i.e., I would get almost the same Chi sq result with each replication of the simulation.)
katxt: similar to what I said above, does your reply mean that if I don't get results through multiple replications that have a mean of Chi sq of about 19 and a SD of about 6.2 then something isn't right (e.g. need to run each replication for more iterations)? Because if it wasn't in fact uniform, then I could be getting much higher values (and thus a much higher mean), so isn't it also possible that I get a lower mean of my Chi sq statistic than the mean of the Chi sq distribution?
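For reference, the "about 19" and "about 6.2" figures mentioned above follow from the standard moments of a chi-square distribution. The short derivation below is a sketch that assumes the null hypothesis holds, that k = 20 classes give 19 degrees of freedom, and that the replications are independent. The last line is the standard error of an average over 10 replications, which is why the average can land below 19 as easily as above it.

```latex
\[
X^2 \sim \chi^2_{\nu}, \qquad \nu = k - 1 = 20 - 1 = 19
\]
\[
\mathrm{E}[X^2] = \nu = 19, \qquad
\mathrm{SD}(X^2) = \sqrt{2\nu} = \sqrt{38} \approx 6.16
\]
\[
\mathrm{SE}\!\left(\overline{X^2}\right) = \frac{\sqrt{2\nu}}{\sqrt{10}}
= \frac{\sqrt{38}}{\sqrt{10}} \approx 1.95
\]
```

The ten statistics listed in post #1 have a mean of about 19.71 and a sample standard deviation of about 5.31, both of which sit comfortably within that picture.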

If it helps at all, here is a screenshot of my results for one replication of my simulation. I simulate an inventory system with random daily demand. It could be many days between orders, but on the day that the reorder point is reached, I record the time of day as t, which is between 0 and 1. My hypothesis is that t has a uniform distribution over that interval, and my expectation was that if I ran a large number of order cycles I would get a low Chi sq value every time.

[Attached screenshot: 1654738179052.png]
 

katxt

Well-Known Member
#5
If the simulated samples do in fact come from a uniform distribution, and if you test the fit against a uniform distribution using 20 bins, then no matter how big the sample, the various X² values you get will be distributed as Chi Square with 19 df. Some X² values will be large, and some small, and about one time in 20 the X² value will be more than 30.14.
If the samples don't come from a uniform distribution your X² values will tend to be higher and, I presume, larger samples will tend to exaggerate this a little, but the X² values will still be spread out in much the same way as for the uniform situation.
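The point in the first paragraph of this post (truly uniform data, 20 bins, X² spread like Chi Square with 19 df whatever the sample size) can be checked with a small simulation. This is only a sketch under stated assumptions: it uses Python's standard `random` module as a stand-in for the inventory simulation, the function names `chi_square_uniform_stat` and `replicate` are made up for illustration, and the sample sizes are kept small so it runs quickly. The critical value 30.14 is the one quoted in the thread.

```python
import random
import statistics

K = 20            # number of equal-width classes on (0, 1)
CRITICAL = 30.14  # 5% critical value for 19 df, as quoted in the thread


def chi_square_uniform_stat(values, k=K):
    """Pearson X^2 statistic for uniformity of `values` on [0, 1) using k bins."""
    counts = [0] * k
    for v in values:
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        counts[min(int(v * k), k - 1)] += 1
    expected = len(values) / k
    return sum((c - expected) ** 2 / expected for c in counts)


def replicate(n, reps, rng):
    """Return `reps` X^2 statistics, each from a fresh uniform sample of size n."""
    return [
        chi_square_uniform_stat([rng.random() for _ in range(n)])
        for _ in range(reps)
    ]


rng = random.Random(2022)  # arbitrary seed, only for repeatability

for n in (1_000, 10_000):
    stats = replicate(n, 200, rng)
    share_above = sum(s > CRITICAL for s in stats) / len(stats)
    print(
        f"n={n:>6}: mean X2={statistics.mean(stats):.2f} "
        f"sd={statistics.stdev(stats):.2f} "
        f"share above {CRITICAL}={share_above:.3f}"
    )
```

Under these assumptions, both sample sizes should give a mean near 19, a standard deviation near 6.2, and roughly one statistic in 20 above 30.14, so increasing n would not be expected to shrink the X² values.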