# Compare non-parametric distributions

#### SiDstats

##### New Member
I generated distributions of travel times of commuters using transportation simulation tools (for different scenarios). The distributions are attached below. I wish to statistically compare each pair of these non-parametric distributions.

Q1. Which test should I use?
There are some tests that compare medians, but these distributions can have multiple peaks, so similar medians do not mean they belong to the same population.

Q2. Can I use a chi-square test?

Thanks

#### Attachments

• Attached file, 160.8 KB

#### katxt

##### Active Member
You could try the two-sample Kolmogorov–Smirnov test, perhaps. It compares the whole empirical distribution functions rather than just the medians.
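
For anyone following along, here is a minimal sketch of the two-sample KS test in Python using `scipy.stats.ks_2samp`. The two samples are made-up stand-ins for simulation output (one unimodal, one bimodal, both with a mean of about 20 minutes), not the poster's data; the variable names and parameters are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical travel times in minutes for two scenarios (illustrative only).
# Scenario A: a single-peaked, right-skewed distribution with mean 20.
scenario_a = rng.gamma(shape=4.0, scale=5.0, size=2000)
# Scenario B: a two-peaked mixture, also with mean 20.
scenario_b = np.concatenate([
    rng.normal(loc=12.0, scale=3.0, size=1000),
    rng.normal(loc=28.0, scale=3.0, size=1000),
])

# Two-sample KS test: D is the largest vertical gap between the two
# empirical CDFs; the p-value is for H0 "both samples share one distribution".
result = ks_2samp(scenario_a, scenario_b)
print(f"D = {result.statistic:.3f}, p-value = {result.pvalue:.3g}")
```

Because the statistic is the largest gap between the two empirical CDFs, the test can pick up a difference in shape even when the centres of the two samples are similar.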

#### SiDstats

##### New Member
> What is your research question? Hypothesis?

Null hypothesis: the distributions belong to the same population and differ only by chance (randomness).
Alt hypothesis: the distributions do not belong to the same population, i.e. the factors varied in each simulation affected the outcome distribution.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Quantile regression, but per @Miner's comment, what is the purpose? I will note that if you compare all of these there will be about 1500 tests, and at an alpha of 0.05 you should expect around 75 of them to be significant by chance alone. That does not even count any looking across distributions for differences. You need to plan your analyses, including corrections, before ever running anything; otherwise you are almost certainly looking at false discoveries and type I errors.
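
To make the arithmetic concrete, here is a rough sketch in Python. It computes the expected number of chance hits for 1500 tests at alpha 0.05, then applies a Holm step-down correction to simulated p-values drawn under the null. Holm is just one possible correction and is swapped in here as an example, not something the thread prescribes; `holm_reject` is a hypothetical helper written for this illustration, and the p-values are simulated, not real results.

```python
import numpy as np

n_tests = 1500
alpha = 0.05
# Expected false positives if every null hypothesis is true: 1500 * 0.05 = 75.
expected_false_positives = n_tests * alpha

def holm_reject(p_values, alpha=0.05):
    """Holm step-down: boolean mask of hypotheses rejected at level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The k-th smallest p-value (k = 0..m-1) is compared with alpha / (m - k).
    thresholds = alpha / (m - np.arange(m))
    failures = np.flatnonzero(p[order] > thresholds)
    # Reject everything before the first failure, nothing from there on.
    n_reject = int(failures[0]) if failures.size else m
    reject = np.zeros(m, dtype=bool)
    reject[order[:n_reject]] = True
    return reject

# Simulated p-values for the case where no scenario differs (uniform on [0, 1]).
rng = np.random.default_rng(1)
null_p = rng.uniform(size=n_tests)
raw_hits = int((null_p < alpha).sum())
holm_hits = int(holm_reject(null_p, alpha).sum())
print(f"expected by chance: {expected_false_positives:.0f}")
print(f"uncorrected hits: {raw_hits}, hits after Holm: {holm_hits}")
```

Holm controls the family-wise error rate; if that is too strict for an exploratory comparison, a false discovery rate procedure such as Benjamini-Hochberg is a common alternative.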


#### Karabiner

##### TS Contributor
> Alt hypothesis: the distributions do not belong to the same population, i.e. the factors varied in each simulation affected the outcome distribution.

So could your underlying question be formulated as: "are the levels/values of the factors associated with characteristics of the distribution (shape, level etc.)"?

Perhaps you could tell us something about the context, theoretical background, and research goals? And about those factors.

With kind regards

Karabiner

#### Miner

##### TS Contributor
It just clicked that you generated these through simulations. Your number of simulations is probably so large (10k to 1M) that any difference would be statistically significant. I would look for differences of "practical" value instead. Are you interested in reducing travel times, variation in travel times, or both? Have you performed a sensitivity analysis on the factors manipulated in the simulation?
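
A small sketch of that point in Python: with very large samples, even a half-minute shift in travel time yields a tiny KS p-value, so it can be more informative to report the size of the difference at a few quantiles. The sample size, the 0.5 minute shift, and the helper `quantile_shift` are all assumptions chosen for illustration, not values from the thread.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)
n = 200_000  # hypothetical number of simulated trips per scenario

# Two made-up scenarios with identical shape, differing by 0.5 minutes.
baseline = rng.gamma(shape=4.0, scale=5.0, size=n)
shifted = rng.gamma(shape=4.0, scale=5.0, size=n) + 0.5

def quantile_shift(a, b, probs=(0.1, 0.5, 0.9)):
    """Difference (b minus a) at selected quantiles, in the data's own units."""
    return {p: float(np.quantile(b, p) - np.quantile(a, p)) for p in probs}

result = ks_2samp(baseline, shifted)
shifts = quantile_shift(baseline, shifted)

print(f"KS p-value: {result.pvalue:.3g}")
for p, d in shifts.items():
    print(f"quantile {p:.1f}: difference of {d:.2f} minutes")
```

Whether a difference of that size matters is a question about the transport problem, not about the p-value.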