Having a hard time deciding on what test to use for data

NPS

New Member
#1
Here is the premise:

There are large events taking place on park land.
We are using noise disturbance hardware to measure decibel levels both during events and when no event is taking place.
We have >1000 samples each for our control and experimental groups.
The data is not normally distributed.
The variances are different.

What test do I use to see if there is a difference between means?

I keep coming back to the Mann-Whitney test, but I can't find any tutorials on how to perform this test with large sample sizes.

Any suggestions?
 

Karabiner

TS Contributor
#2
The U-test is not a test for means.

You can use the Welch test, which is a t-test corrected
for unequal variances. Since both samples are about
the same size (I suppose), Welch and t-test won't
differ much.

Whether or not the data (within groups!) are normally distributed
doesn't matter for the t-test with a sample size as large
as yours.
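
A minimal sketch of that comparison in Python, assuming SciPy is available. The two arrays are synthetic, skewed stand-ins with unequal variances, not the actual park readings, and the names `control` and `event` are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic placeholder data (NOT real measurements): skewed, unequal variances.
control = rng.gamma(shape=9.0, scale=5.0, size=1200)   # readings outside events
event = rng.gamma(shape=4.0, scale=13.0, size=1200)    # readings during events

# Welch test: t-test that does not assume equal variances.
welch = stats.ttest_ind(event, control, equal_var=False)

# Classical (pooled-variance) t-test, for comparison.
student = stats.ttest_ind(event, control, equal_var=True)

print("Welch:  ", welch.statistic, welch.pvalue)
print("Student:", student.statistic, student.pvalue)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, which is why the two tests barely disagree here.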

With kind regards

K.
 

NPS

New Member
#3
The U test does show a difference in groups, correct?

Once you rank them, it shows the difference in medians between the two groups?
 

Karabiner

TS Contributor
#4
NPS said: "Once you rank them, it shows the difference in medians between the two groups?"
Hopefully so, but not necessarily. The U-test is not a test
for medians. It just tells us whether in one group ranks
tend to be higher than in the other group. Maybe the
description of the Wilcoxon rank sum test (which gives
exactly the same result as the U-test) is a bit more
illustrative than the description of the Mann-Whitney test.
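
To make the rank-sum view concrete, here is a small sketch, assuming SciPy 1.7 or later (for the `method` argument). The data are synthetic placeholders. It computes U from the rank sum of the first group by hand and compares it with `scipy.stats.mannwhitneyu`. The asymptotic method uses a normal approximation, so large samples are not a problem.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic placeholder data (NOT real measurements).
x = rng.normal(50, 5, size=1500)
y = rng.normal(52, 10, size=1500)
n1, n2 = len(x), len(y)

# Mann-Whitney U test with the large-sample normal approximation.
mwu = stats.mannwhitneyu(x, y, alternative="two-sided",
                         method="asymptotic", use_continuity=False)

# Wilcoxon rank-sum view: rank the pooled data, sum the ranks of group x.
ranks = stats.rankdata(np.concatenate([x, y]))
rank_sum_x = ranks[:n1].sum()
u_from_ranks = rank_sum_x - n1 * (n1 + 1) / 2

# U / (n1 * n2) estimates P(X > Y): how often a value from x exceeds one from y.
prob_x_gt_y = u_from_ranks / (n1 * n2)

# The rank-sum test reports a z statistic but the same p-value.
rs = stats.ranksums(x, y)

print(mwu.statistic, u_from_ranks, prob_x_gt_y)
print(mwu.pvalue, rs.pvalue)
```

The quantity being tested is whether P(X > Y) differs from 0.5, which is a statement about ranks rather than about means or medians.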

If you want to test the means, then with n > 2000 you
can perform a t-test even if data (within groups) are
non-normal. By the way, I am not sure that a U-test
would be unaffected by markedly different variances.

With kind regards

K.
 

Miner

TS Contributor
#5
The decibel scale is logarithmic, which is why the distributions are non-normal and heteroskedastic. You should be able to transform the results and apply a standard test.
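
One possible reading of "transform the results" is to convert decibel levels back to a linear quantity before averaging. The sketch below assumes the readings are sound levels for which 10^(L/10) is the appropriate back-transform. The helper names `db_to_linear` and `energy_mean_db` and the three example readings are hypothetical.

```python
import numpy as np

def db_to_linear(levels_db):
    """Convert decibel levels to a relative linear (power-like) scale."""
    return 10.0 ** (np.asarray(levels_db, dtype=float) / 10.0)

def energy_mean_db(levels_db):
    """Average on the linear scale, then convert the mean back to decibels."""
    return 10.0 * np.log10(db_to_linear(levels_db).mean())

# Made-up readings for illustration only.
readings = np.array([50.0, 60.0, 70.0])

arithmetic = readings.mean()          # plain mean of the dB values
energetic = energy_mean_db(readings)  # mean taken on the linear scale

print(arithmetic, energetic)
```

The two averages differ because the loudest readings dominate on the linear scale. Whether a test on the transformed values is preferable depends on which of these averages the question is really about.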
 

Karabiner

TS Contributor
#6
But since it is already logarithmic, and the measurements are still
non-normal and heteroscedastic, what should usually be done, then?

With kind regards

K.