3 IV, 4 DVs, and a partridge in ...

#1
I am going back and forth between which data analysis would be best to use for my study. Any thoughts and/or recommendations would be greatly appreciated.

I am conducting research into the impacts of three independent variables (speaker type, task difficulty, and task format) on the oral speech production of language learners. There are four dependent variables: frequency of mid pauses, frequency of end pauses, average number of pauses, and average pause length.
Data has been coded and collected, extreme outliers were removed, and all the dependent variables show reasonable normality of variance, with little skewness or kurtosis. Analysis was conducted in R and Q-Q plots were generated for each dependent variable separately for each of the three independent variables. The data set comprises 80 participants (40 of each speaker type) who each provided two speech samples, so a total of 160 different tasks.

I am now trying to decide which statistical analysis to employ to show me clearly the impacts of task complexity (IV) on pause frequency, pause length, etc. (DVs), but also differentiated by speaker type (IV) and task format (IV). Would it be better to run separate one-way ANOVAs for each IV, or could they be combined into one statistical test? I have been doing most of my analysis in R, if that matters.

Thanks for taking the time to read this. Hopefully it is clear; if not, I apologise, and please ask any questions you like.
 

Karabiner

TS Contributor
#2
Admittedly, I do not fully understand your study design and your experimental procedure.
Is it a mix of between-subjects and within-subject factors?

Data has been coded and collected, extreme outliers were removed,
Why this? You can do this if a data point is obviously invalid, but usually not just because
a data point looks "extreme" in someone's view.

I am now trying to decide which statistical analysis to employ to show me clearly the impacts of task complexity (IV)
Where does task complexity come into play? You only mentioned "speaker type, task difficulty,
and task format" as independent variables.

With kind regards

Karabiner
 
#3
Hi karabiner,

Thank you for your reply.

The study does contain between-subjects and within-subject factors, as both participant types (IV - 'speaker type' - L1 or L2) performed both a more complex and a less complex task (IV - 'task complexity' +/- ), as well as a written task and pictorial task (IV - 'task format' - pictorial/written).

The data points removed were not invalid, but outliers were removed to meet the assumptions for the requirements of the statistical tests (MANOVA, ANOVA), as they were non-normally distributed according to the Shapiro-Wilk p-values calculated.

'Task complexity' is the correct IV not 'task difficulty' as I incorrectly stated.

Thanks again for your reply.
 

Karabiner

TS Contributor
#4
both participant types (IV - 'speaker type' - L1 or L2) performed both a more complex and a less complex task (IV - 'task complexity' +/- ), as well as a written task and pictorial task (IV - 'task format' - pictorial/written).
So each participant performed 4 tasks in total? Did you randomize the sequence of the 4 tasks,
in order to control sequence effects?

The data points removed were not invalid, but outliers were removed to meet the assumptions for the requirements of the statistical tests (MANOVA, ANOVA), as they were non-normally distributed according to the Shapiro-Wilk p-values calculated.
ANOVA or MANOVA do not require normally distributed dependent variables. They might assume
normally distributed residuals, but only if the sample size is quite small (n < 30). Apart from
that, removing data just to meet assumptions of some statistical procedure seems absolutely
inadequate.
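
To illustrate the distinction between the raw dependent variable and the residuals, here is a minimal sketch in base R. Everything in it is an assumption made for illustration: the data are simulated, the column names (speaker, complexity, pause_length) are hypothetical, and the plain lm() fit ignores the repeated-measures structure, so it only shows where a normality check would be applied, not how the final model should look.

```r
# Simulated data only; none of these numbers come from the actual study.
set.seed(2)
d <- expand.grid(
  rep        = 1:40,
  speaker    = factor(c("L1", "L2")),
  complexity = factor(c("less", "more"))
)

# Made-up cell means that differ between groups, plus normal noise.
cell_mean <- ifelse(d$speaker == "L2", 0.9, 0.5) +
  ifelse(d$complexity == "more", 0.2, 0)
d$pause_length <- cell_mean + rnorm(nrow(d), sd = 0.1)

# Fit the cell-means model and pull out the residuals.
fit <- lm(pause_length ~ speaker * complexity, data = d)
res <- residuals(fit)

# Check the residuals, not the raw DV.
qqnorm(res); qqline(res)
shapiro.test(res)

# The pooled raw DV mixes groups with different means, so it can look
# non-normal even when the residuals are fine.
shapiro.test(d$pause_length)
```

If the residuals look acceptable, there is no reason to trim observations from the raw data on normality grounds.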

'Task complexity' is the correct IV not 'task difficulty' as I incorrectly stated.
You could maybe analyse the dependent variables separately, each with a
"mixed" ANOVA with the between-subjects factor speaker type and the
two within-subject factors task complexity and task format. Or, if the 4
DVs jointly represent a hypothetical construct, you can include all of
them, using "response type" as an additional within-subject factor.
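
As a rough sketch of what such a mixed ANOVA for one DV could look like in base R: the data below are simulated, the column names (id, speaker, complexity, format, pause_length) are hypothetical, and the layout assumes every participant completed all four complexity x format combinations, which has not been confirmed in this thread.

```r
# Simulated data only: 80 hypothetical participants, 40 per speaker type,
# each measured in all four complexity x format cells (an assumption).
set.seed(1)
d <- expand.grid(
  id         = factor(1:80),
  complexity = factor(c("less", "more")),
  format     = factor(c("pictorial", "written"))
)
d$speaker <- factor(ifelse(as.integer(d$id) <= 40, "L1", "L2"))
d$pause_length <- rnorm(nrow(d), mean = 0.6, sd = 0.15)  # made-up values

# Mixed ANOVA: speaker is between-subjects; complexity and format are
# within-subject, so they go into the Error() term nested within id.
fit <- aov(
  pause_length ~ speaker * complexity * format +
    Error(id / (complexity * format)),
  data = d
)
summary(fit)
```

The same call would be repeated for each of the other DVs. If each participant only contributed two of the four cells, the design is not fully crossed within subjects and this formula would need to be adapted.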

With kind regards

Karabiner