# Likert analysis. Chi-square vs t test vs Mann-Whitney-Wilcoxon

#### Burnsie_UK

##### Member
I'm afraid that you have problems. Basically you are trying to discover too much all at once.
If you google multiple p corrections you will find quite a few ways folk have devised to improve on Bonferroni, but realistically they won't help much in your situation.
If you are still in the planning stage, and haven't collected your data yet, you might think about redirecting your efforts into just a few key questions.
Oh, I am well aware I have problems..... anyway, back to the topic at hand

So the survey has gone out and the data have been collected (this is one of the problems of having distance supervisors, I guess!)

• What’s the max number of tests you could run before running into the issues we’re talking about?
• Could I run inferential statistics on only some of my questions? It would lead to massive limitations, but surely better than doing none (I’ve seen other surveys where they just report descriptives).
• I could also limit / remove some comparisons. For example, I don’t really need age and years’ experience. I would argue that the latter is more important. Additionally, I was advised to “just put a COVID question(s) in”… it doesn’t really add to the project itself. So could reduce these down as a starter.
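For a rough sense of why the number of tests matters (there is no single hard maximum), here is a minimal sketch. It assumes the tests are independent and that every null hypothesis is true, which a real survey will not satisfy exactly, and the function names are made up for illustration.

```python
# Family-wise error rate (FWER) for m tests, each run at level alpha.
# Assumption: the tests are independent and every null hypothesis is true.

def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(m: int, alpha: float = 0.05) -> float:
    """Per-test p-value cut-off that keeps the FWER at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 5, 10, 20, 50):
        print(f"{m:>3} tests: FWER = {familywise_error_rate(m):.2f}, "
              f"Bonferroni cut-off = {bonferroni_threshold(m):.4f}")
```

Dropping comparisons (for example age, or the COVID questions) lowers m, which both reduces the chance of a spurious finding and makes the corrected cut-off less severe.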

So if sample size is large enough, then sex, age, current role, years in role, level of qualification could jointly
be used to predict the other variables, using multiple linear regression, or logistic regression (for yes/no
variables). That would reduce the number of analyses.

Karabiner
the sample is 100 (although I haven't checked how many need to be discarded).

#### Karabiner

##### TS Contributor
For e.g. linear regression, categorical and ordinal variables with k levels are transformed into k-1 dummy variables.
So education high/middle/low would be transformed into 2 variables: education_middle (0/1) and education_high (0/1).
The remaining level does not need a category of its own and serves as reference level.
I'd guess that even though this adds quite a number of predictors to the regression, n=100 should still be enough
to maintain a reasonable ratio of n-per-predictor.

With kind regards

Karabiner

#### Burnsie_UK

##### Member
Okay, this is a bit 1 step forward, 2 back... but it might be a solution... I’ll get my reading on!

#### katxt

##### Well-Known Member
Instead of p values, could you perhaps give confidence intervals, with the caveat that there is a chance that the true values may sometimes lie outside these limits? Then the readers can make their own call as to what is important.

#### katxt

##### Well-Known Member
Another suggestion. Split the data in two and do each analysis twice. Accept those which get p<0.05 both times. For any single test where there is no real effect, this gives a false positive only one time in 400 (0.05 × 0.05 = 0.0025).

#### Burnsie_UK

##### Member
sorry, what do you mean by "splitting the data in two"?

#### katxt

##### Well-Known Member
Split the 100 subjects at random into two groups of 50. Do all your tests on the first group. Then repeat all the tests which had p<0.05 on the other group. Declare significant only those tests that reach p<0.05 a second time. The idea is that you use the second group as an independent check on the first results.
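
As a sketch of the two-stage idea, assuming each analysis can be wrapped as a function that takes a list of subjects and returns a p-value. The name `split_sample_screen`, the test labels, and the fixed p-values below are hypothetical placeholders, not real results.

```python
import random


def split_sample_screen(subjects, tests, alpha=0.05, seed=0):
    """Run every test on one random half, then re-run the survivors on the other half.

    `tests` maps a label to a function that takes a list of subjects
    and returns a p-value.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:]

    # Stage 1: all tests on the first half.
    candidates = [name for name, test in tests.items() if test(first) < alpha]
    # Stage 2: only the stage-1 survivors, on the untouched second half.
    confirmed = [name for name in candidates if tests[name](second) < alpha]
    return candidates, confirmed


# Hypothetical stand-ins for real analyses; replace with the actual tests.
demo_tests = {
    "role_vs_q1": lambda group: 0.01,  # placeholder p-value
    "sex_vs_q2": lambda group: 0.40,   # placeholder p-value
}
subjects = list(range(100))  # placeholder subject IDs
candidates, confirmed = split_sample_screen(subjects, demo_tests)
print(candidates, confirmed)
```

Fixing the seed makes the random split reproducible, which matters if the split has to be reported.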

#### Karabiner

##### TS Contributor
I guess she'd need a recognized reference for that procedure?

#### katxt

##### Well-Known Member
Maybe. It sounds like something somebody would have thought of already.
Or just do it and see what the referees say. It seems quite reasonable. What would the likely objections be?
Or write a short paper if there isn't anything published.
Or maybe it turns out that you are no better off.

#### katxt

##### Well-Known Member
A quick simulation shows that in fact there is no particular advantage to using the split sample, which probably explains why it's not an established procedure.
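
The simulation itself isn't shown in the thread, so the following is not that code, only a minimal sketch of one way to run such a check. It assumes a one-sample z-test with a known standard deviation of 1, a made-up effect size of 0.3, and a comparison against a single test on all 100 subjects at 0.05 × 0.05 = 0.0025, so that both rules have the same nominal false positive rate.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def z_test_p(sample):
    """Two-sided p-value for H0: mean = 0, assuming a known standard deviation of 1."""
    z = mean(sample) * len(sample) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def simulate(effect, n=100, n_sims=2000, alpha=0.05, seed=1):
    """Return (split-sample rejection rate, full-sample rejection rate)."""
    rng = random.Random(seed)
    split_hits = full_hits = 0
    for _ in range(n_sims):
        # Observations are independent draws, so first half vs second half
        # is already a random split.
        data = [rng.gauss(effect, 1.0) for _ in range(n)]
        first, second = data[: n // 2], data[n // 2:]
        # Split-sample rule: p < alpha in both halves.
        if z_test_p(first) < alpha and z_test_p(second) < alpha:
            split_hits += 1
        # Comparison rule: one test on all the data at alpha squared.
        if z_test_p(data) < alpha ** 2:
            full_hits += 1
    return split_hits / n_sims, full_hits / n_sims


if __name__ == "__main__":
    for effect in (0.0, 0.3):  # 0.0 = no real effect; 0.3 is an assumed effect
        split_rate, full_rate = simulate(effect)
        print(f"effect={effect}: split={split_rate:.3f}, full={full_rate:.3f}")
```

With effect 0.0 the two rates estimate the false positive rate of each rule; with a nonzero effect they estimate power, so the two rules can be compared directly.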