Are distributions the "same" in their extreme values?


KE2
01-29-2009, 09:06 AM
Hello,

I have two independent datasets which are measurements of the height of ocean waves. I would like to tell if these datasets have the same distributions. The catch is, it's really important that the distributions of their extreme values (very high waves) are the same, and I care less about more typical waves. What are the tools to compare these distributions for the high values, and also how can I assign error bars so that I can say "within xx error, these two distributions are the same in their extreme values"?

I think skewness and kurtosis will tell me about the similarity of the most common observations, and not the long tail values.

Thanks,
KE

TheEcologist
01-29-2009, 09:35 AM
You would want to search for tests that look for differences in distribution tails. This paper springs to mind. I'm not sure it will help, so why don't you see whether it is useful for you:

http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2529340

Masteras
01-29-2009, 09:58 AM
You absolutely cannot rely on skewness or kurtosis, nor on the error bars. Do the two distributions look similar in a histogram? In SPSS, try testing the null hypothesis that the distributions do not differ. As for the high values, a standard distribution to fit to extreme values is the Gumbel. You can also try running the test in SPSS via the bootstrap.
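For anyone without SPSS, here is a minimal Python sketch of the Gumbel suggestion. It assumes the extremes are taken as block maxima (the largest wave in each consecutive block of observations), which is the usual setting for a Gumbel fit but is not stated in the thread. The function name, the block size of 100, and the synthetic Weibull data are all hypothetical stand-ins for the real wave records.

```python
import numpy as np
from scipy import stats

def fit_gumbel_to_block_maxima(heights, block_size):
    """Fit a Gumbel distribution to the maxima of consecutive blocks."""
    heights = np.asarray(heights, dtype=float)
    n_blocks = heights.size // block_size
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")
    # Drop the incomplete final block, then take the maximum of each block.
    trimmed = heights[: n_blocks * block_size]
    maxima = trimmed.reshape(n_blocks, block_size).max(axis=1)
    # Maximum-likelihood estimates of the Gumbel location and scale.
    loc, scale = stats.gumbel_r.fit(maxima)
    return maxima, loc, scale

# Synthetic stand-in data (assumption): replace with the two real datasets.
rng = np.random.default_rng(0)
waves_a = 2.0 * rng.weibull(2.0, size=5000)
waves_b = 2.0 * rng.weibull(2.0, size=5000)

maxima_a, loc_a, scale_a = fit_gumbel_to_block_maxima(waves_a, block_size=100)
maxima_b, loc_b, scale_b = fit_gumbel_to_block_maxima(waves_b, block_size=100)
print(f"A: loc={loc_a:.3f}, scale={scale_a:.3f}")
print(f"B: loc={loc_b:.3f}, scale={scale_b:.3f}")
```

Comparing the two fitted (loc, scale) pairs only becomes meaningful once they carry uncertainty estimates, for example from resampling the block maxima.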

TheEcologist
01-29-2009, 01:23 PM
The problem here is that you are not interested in the entire distribution, only in the tail. I believe the tests in SPSS are especially sensitive to data in high-density regions, so they do not necessarily reflect meaningful differences in low-density regions. Since here we are only interested in the latter, I suspect they won't be of any use for low-frequency, high-severity problems where the area of interest is the tail. However, I might be mistaken, as I don't really use SPSS for anything except teaching and haven't checked. Anyway, quite a few scientists work on low-frequency, high-severity problems (see the URLs below). The best test will depend on the real-world problem, and we don't really know KE's exact problem.
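One way to check this suspicion on your own data is sketched below in Python. It uses the two-sample Kolmogorov-Smirnov test as a stand-in for a whole-distribution test (an assumption, not necessarily what SPSS runs), and applies it once to the full samples and once to the values above a common high threshold only. The function name, the 95% pooled quantile as threshold, and the synthetic data (identical bulk, stretched tail in the second sample) are all hypothetical choices for illustration.

```python
import numpy as np
from scipy import stats

def ks_whole_and_tail(a, b, q=0.95):
    """Two-sample KS p-values for the full samples and for the tails only."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p_whole = stats.ks_2samp(a, b).pvalue
    # Common threshold: the q-th quantile of the pooled data (assumption).
    threshold = np.quantile(np.concatenate([a, b]), q)
    tail_a = a[a > threshold]
    tail_b = b[b > threshold]
    if tail_a.size == 0 or tail_b.size == 0:
        raise ValueError("one sample has no values above the threshold")
    p_tail = stats.ks_2samp(tail_a, tail_b).pvalue
    return p_whole, p_tail, threshold

# Synthetic stand-in data (assumption): same bulk, different tails.
rng = np.random.default_rng(1)
waves_a = rng.exponential(1.0, size=5000)
x = rng.exponential(1.0, size=5000)
# Stretch only the part of the second sample that lies above 3.
waves_b = np.where(x > 3.0, 3.0 + 2.0 * (x - 3.0), x)

p_whole, p_tail, threshold = ks_whole_and_tail(waves_a, waves_b)
print(f"threshold={threshold:.2f}, p(whole)={p_whole:.3g}, p(tail)={p_tail:.3g}")
```

If the whole-sample p-value stays large while the tail-only p-value is small, the whole-distribution test is missing a difference that matters here. The tail-only test compares just the values above the threshold, so it says nothing about how often each dataset exceeds it.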

KE: I guess one way you can solve this is by fitting to each dataset, through maximum likelihood (MLE), a distribution with a distinct 'tail' parameter, such as the Generalized Pareto Distribution. As this parameter only affects the tail, you can then calculate standard errors (SEs) to test whether the fitted parameters differ significantly between the two datasets. You can calculate the SEs either from the Hessian or by bootstrapping. Here's a tutorial that could get you started: http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/gparetodemo.html
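The linked tutorial is in MATLAB. A rough Python equivalent of the idea is sketched here, under several assumptions of my own: exceedances over a common threshold (the 95% quantile of the pooled data) are fitted with a Generalized Pareto Distribution, the SE of the shape parameter comes from a simple nonparametric bootstrap, and the two shapes are compared with an approximate z-test. All function names, the threshold, and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats

def gpd_shape(excess):
    """MLE of the GPD shape for threshold excesses (location fixed at 0)."""
    shape, _, _ = stats.genpareto.fit(excess, floc=0.0)
    return shape

def bootstrap_se(excess, n_boot=100, seed=0):
    """Bootstrap standard error of the GPD shape estimate."""
    rng = np.random.default_rng(seed)
    shapes = np.empty(n_boot)
    for i in range(n_boot):
        resample = rng.choice(excess, size=excess.size, replace=True)
        shapes[i] = gpd_shape(resample)
    return shapes.std(ddof=1)

def compare_tail_shapes(a, b, q=0.95, n_boot=100):
    """Approximate z-test for equal GPD shape parameters in two samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Common threshold from the pooled data (assumption).
    u = np.quantile(np.concatenate([a, b]), q)
    exc_a, exc_b = a[a > u] - u, b[b > u] - u
    xi_a, xi_b = gpd_shape(exc_a), gpd_shape(exc_b)
    se_a = bootstrap_se(exc_a, n_boot, seed=1)
    se_b = bootstrap_se(exc_b, n_boot, seed=2)
    # Normal approximation for the difference of two independent estimates.
    z = (xi_a - xi_b) / np.hypot(se_a, se_b)
    p_value = 2.0 * stats.norm.sf(abs(z))
    return xi_a, se_a, xi_b, se_b, p_value

# Synthetic stand-in data (assumption): replace with the real wave heights.
rng = np.random.default_rng(42)
waves_a = rng.exponential(1.0, size=4000)
waves_b = rng.exponential(1.0, size=4000)

xi_a, se_a, xi_b, se_b, p_value = compare_tail_shapes(waves_a, waves_b)
print(f"shape A = {xi_a:.3f} +/- {se_a:.3f}")
print(f"shape B = {xi_b:.3f} +/- {se_b:.3f}")
print(f"p-value for equal shapes = {p_value:.3f}")
```

The bootstrap SEs double as the error bars KE asked about: the shape estimates plus or minus roughly two SEs give approximate 95% intervals. Results can depend noticeably on the threshold, so it is worth repeating the comparison for a few values of `q`.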


The URLs concerning low-frequency, high-severity problems and tail testing:
http://cat.inist.fr/?aModele=afficheN&cpsidt=13549556

http://www.allacademic.com/meta/p_mla_apa_research_citation/1/8/7/4/3/p187437_index.html

http://www.questia.com/googleScholar.qst;jsessionid=JB1SJVqST11vQkGs04Qmt ThTDLHYbM20fry4tJDwgDhc9m93L2yy!-1045821366!-1100836617?docId=79251766