ANOVA n00b questions

#1
Hello! I'm putting together a proposal and would like to conduct a certain type of analysis but my wording on this is rather weak so please correct me when necessary.

I have three independent variables (2 treatment & 1 control) and 1 dependent variable (a test score). N will be 30 or so. I'm planning on doing a one-way ANOVA & Tukey or Tukey Kramer (depending on the actual N) as my post hoc comparisons.

After I collect my data, I would like to conduct a test that looks to see if there is any significance between low and high performers (50% split) based upon the test score. So here are my questions:

1. Is comparing based upon the two groups appropriate?
2. Is this properly called a split-wise comparison?
3. Would the placement in low/high grouping be a dependent or independent variable?
4. Would I be conducting a two-way ANOVA to do this?

Thank you in advance!
 

trinker

ggplot2orBust
#2
I have three independent variables (2 treatment & 1 control)
You have one variable with three groups.

1 dependent variable (a test score).
Test scores are not variables. They're measures of variables. What does the test measure? That's your variable.

After I collect my data, I would like to conduct a test that looks to see if there is any significance between low and high performers (50% split) based upon the test score. So here are my questions:

1. Is comparing based upon the two groups appropriate?
2. Is this properly called a split-wise comparison?
3. Would the placement in low/high grouping be a dependent or independent variable?
Here you kinda lost me. What you do with the ANOVA is see if there are any differences among the three groups with regard to 'test score'. If there are, you want to know where. This is where you may use post-hoc comparisons between groups (basically t-tests with a correction for multiple comparisons). With J groups there are J(J-1)/2 possible pairwise comparisons (so for you: 3(3-1)/2 = 3), of which at most J-1 = 2 can be mutually orthogonal.
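For anyone who wants to see the arithmetic, here is a minimal sketch of the one-way ANOVA F statistic computed by hand in plain Python. The scores and the group names (control, treatment_a, treatment_b) are invented for illustration and are not from the study described above. The sketch also lists the distinct group pairs a Tukey-style post-hoc procedure would compare.

```python
from itertools import combinations


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group name -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of scores around their own group mean
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_between += len(scores) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in scores)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical transfer-test scores, made up for this example only.
scores = {
    "control": [1.0, 2.0, 3.0],
    "treatment_a": [2.0, 3.0, 4.0],
    "treatment_b": [6.0, 7.0, 8.0],
}

f_stat, df_between, df_within = one_way_anova(scores)
pairs = list(combinations(scores, 2))  # every distinct pair of groups

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print("pairwise comparisons:", pairs)
```

The F value would then be compared against an F distribution with those degrees of freedom; a stats package will do that step and the Tukey adjustment for you.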

As a side note, if you have a theory about directionality you may want to forgo the ANOVA and use a planned contrast instead. This increases power, but you'd better be sure about the directionality of the differences. So, in general, the ANOVA is less powerful than the planned contrast but safer.
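A rough sketch of what a planned contrast looks like numerically, assuming equal error variance across groups and using the pooled within-group mean square as the error term. The data, the group names, and the choice of weights (both treatments averaged versus control) are hypothetical and only there to show the mechanics.

```python
import math


def planned_contrast(groups, weights):
    """Return (estimate, t, df_error) for a contrast whose weights sum to zero."""
    if abs(sum(weights.values())) > 1e-12:
        raise ValueError("contrast weights must sum to zero")

    n_total = sum(len(s) for s in groups.values())
    df_error = n_total - len(groups)
    means = {g: sum(s) / len(s) for g, s in groups.items()}

    # Pooled within-group variance (the ANOVA mean square error).
    ss_within = sum((x - means[g]) ** 2 for g, s in groups.items() for x in s)
    mse = ss_within / df_error

    estimate = sum(weights[g] * means[g] for g in groups)
    std_error = math.sqrt(
        mse * sum(weights[g] ** 2 / len(groups[g]) for g in groups)
    )
    return estimate, estimate / std_error, df_error


# Hypothetical scores, made up for this example only.
scores = {
    "control": [1.0, 2.0, 3.0],
    "treatment_a": [2.0, 3.0, 4.0],
    "treatment_b": [6.0, 7.0, 8.0],
}
# Hypothetical directional hypothesis: the treatments, on average, beat control.
weights = {"control": -1.0, "treatment_a": 0.5, "treatment_b": 0.5}

estimate, t_stat, df_error = planned_contrast(scores, weights)
print(f"contrast = {estimate:.2f}, t({df_error}) = {t_stat:.3f}")
```

Because the direction is stated in advance, the t value can be referred to a one-sided critical value, which is where the extra power comes from.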

I'm a little confused about your design, so there may be a better answer for what you are planning. If you gave your research questions, this would provide more information to help you.
 
#3
Thank you for the feedback! I can't help but laugh at how much time goes into putting together good questions.

I'm not certain of the directionality so ANOVA will probably be my best bet.

Scratch my other questions because in trying to clarify for you I've wiped away a few of the cobwebs in my brain. After removing techno-babble, my question is essentially, "What impact does treatment A (determined by a characteristic representative of the subject) have upon achievement as measured by a transfer test in comparison to treatment B (determined by a characteristic that is not representative of the subject)?" I have a couple of questions that follow along the same lines using different forms of assessment.

Now, for my next research question, and this is kind of rough, "What effect does verbal ability (independent variable, high and low at 50th percentile, measured by a different test, e.g., ACT, SAT) have upon achievement as measured by a transfer test?" This question seems more difficult for me to write because I want to include some reference to the treatment groups but it seems to muck it all up.
 

noetsi

Fortran must die
#4
I have several questions there. First, what type of data are your independent and dependent variables (interval, ordinal, etc.)? This will play a major role in what method you use to analyze this question. Are different people's abilities being measured by different tests (that is, some on the ACT, some on the SAT, etc.)? If so, your results will be confounded. You won't know whether verbal ability or the specific test used to measure it is contributing to the results, unless you believe that the tests produce identical results (which I doubt they would). Or are you generating your verbal variable by combining results for the same individual from different tests?
 
#5
Sorry for leaving out details. I'm also working on breaking myself of using the term etc. in research!

The independent variable groups are instructional treatments and the dependent variable is interval data.

All participants would have to be scored on the same test. I'm currently looking at the ACT but the university accepts SAT as well. I'll probably have to administer the verbal ability assessment during the data collection or just drop it altogether and keep it simple. But what do you learn from keeping everything simple? Oh yeah, you get the job done I suppose.
 

noetsi

Fortran must die
#6
So your independent variable is a nominal-level variable, right (with each treatment being one level)? If so, I would be inclined to go with ANOVA, but you could run OLS regression as well.
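To illustrate why the two are interchangeable here, this is a small sketch (plain Python, made-up scores) of an OLS regression on dummy-coded treatment indicators with control as the reference level. The helper names solve_linear and ols are hypothetical, and the solver is a bare-bones Gauss-Jordan routine rather than anything you would use in production. The intercept comes out as the control mean and each slope as a treatment-minus-control mean difference, which is the same model the ANOVA fits.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical scores, made up for this example only.
scores = {
    "control": [1.0, 2.0, 3.0],
    "treatment_a": [2.0, 3.0, 4.0],
    "treatment_b": [6.0, 7.0, 8.0],
}

design, y = [], []
for group, values in scores.items():
    for value in values:
        # Columns: intercept, dummy for treatment_a, dummy for treatment_b.
        design.append([1.0,
                       float(group == "treatment_a"),
                       float(group == "treatment_b")])
        y.append(value)

intercept, b_a, b_b = ols(design, y)
print(f"control mean = {intercept:.2f}")
print(f"treatment_a - control = {b_a:.2f}")
print(f"treatment_b - control = {b_b:.2f}")
```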

There is a big difference between administering a test yourself and using one they have already taken. If you do it yourself, then something like the ACT takes a long time, and you might have trouble getting volunteers to take it. If you use a pre-existing test you should be careful about comparability. If your group took the test at widely different times, this may affect how the test results predict your dependent variable (and timing does not look to me like it's part of your design). You also need to be careful that the test publisher did not change the test significantly in the interim, as sometimes happens. The point is you want to focus on the verbal measure, not confound your results with a methods issue related to the test itself.

If you drop the test, how would you measure your independent variable? :)
 

CowboyBear

Super Moderator
#7
The other issues seem well covered so I'll focus on this part:

Now, for my next research question, and this is kind of rough, "What effect does verbal ability (independent variable, high and low at 50th percentile, measured by a different test, e.g., ACT, SAT) have upon achievement as measured by a transfer test?" This question seems more difficult for me to write because I want to include some reference to the treatment groups but it seems to muck it all up.
In general it's almost always a bad idea to take a continuous or approximately continuous variable [1] like ACT or SAT verbal ability scores and dichotomise it for the purpose of analysis. You lose information, you lose power; don't do it. If you're interested in how verbal ability scores relate to scores on the transfer test, you can use correlations (bivariate level) or include the verbal ability scores in your ANOVA as a covariate (making it an ANCOVA). In the latter case, the effects of group on the dependent variable would represent the effect when controlling for the covariate (verbal ability scores).

[1] trinker makes an interesting point about what the "variable" is (the observed measurements or the latent variable they're presumed to measure). I'd probably argue that they both can be called variables (one observed, one latent), but I'm not too sure on this.
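A minimal sketch of what "controlling for the covariate" does to the group means in an ANCOVA, assuming the regression slope of transfer score on verbal score is the same in every group (the usual homogeneity-of-slopes assumption). The (verbal, transfer) pairs and the group names are invented for illustration.

```python
def ancova_adjusted_means(data):
    """data: dict of group -> list of (covariate, outcome) pairs.

    Returns (pooled_within_slope, dict of covariate-adjusted group means).
    """
    all_x = [x for pairs in data.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)

    sxx = 0.0  # pooled within-group sum of squares of the covariate
    sxy = 0.0  # pooled within-group cross-products
    group_means = {}
    for group, pairs in data.items():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        group_means[group] = (mean_x, mean_y)
        sxx += sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)

    slope = sxy / sxx
    # Shift each group's outcome mean to where it would sit if the group
    # had the grand-mean covariate value.
    adjusted = {g: my - slope * (mx - grand_x)
                for g, (mx, my) in group_means.items()}
    return slope, adjusted


# Hypothetical (verbal score, transfer score) pairs, made up for this example.
data = {
    "control": [(18.0, 9.0), (20.0, 10.0), (22.0, 11.0)],
    "treatment_a": [(22.0, 13.0), (24.0, 14.0), (26.0, 15.0)],
    "treatment_b": [(20.0, 15.0), (22.0, 16.0), (24.0, 17.0)],
}

slope, adjusted = ancova_adjusted_means(data)
print(f"pooled within-group slope = {slope:.2f}")
for group, mean in adjusted.items():
    print(f"{group}: adjusted mean = {mean:.2f}")
```

In this toy data treatment_a happens to have the highest verbal scores, so its raw advantage over control shrinks once verbal ability is held constant. A full ANCOVA would also test those adjusted differences, which is best left to a stats package.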
 

noetsi

Fortran must die
#8
I have never seen ANOVA used for latent variables.

It's a really good point about not taking interval data and turning it into dichotomous data.
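A quick simulation sketch of how much a median split costs. It assumes standard-normal "verbal" scores and a "transfer" score built to correlate about 0.5 with them; all names, the sample size, and the seed are arbitrary choices for illustration. For normally distributed data, theory says a median split shrinks the correlation by a factor of roughly 0.80.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Simulated data: the population correlation is 0.5 by construction.
verbal = [rng.gauss(0.0, 1.0) for _ in range(n)]
transfer = [0.5 * v + math.sqrt(0.75) * rng.gauss(0.0, 1.0) for v in verbal]

# Median split: 1.0 for "high" verbal ability, 0.0 for "low".
median = sorted(verbal)[n // 2]
high = [1.0 if v >= median else 0.0 for v in verbal]

r_continuous = pearson_r(verbal, transfer)
r_split = pearson_r(high, transfer)

print(f"r using continuous verbal scores: {r_continuous:.3f}")
print(f"r using the high/low split:       {r_split:.3f}")
```

A weaker observed relationship means less power at the same N, which matters a lot with N around 30.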
 

CowboyBear

Super Moderator
#9
I have never seen ANOVA used for latent variables.
Oh to be sure, ANOVA doesn't involve latent variable modelling. I was more just talking conceptually - we can still think of observed test scores as being indirect measures of a latent (unobserved) variable like verbal ability, even if we're not actually using a latent variable modelling framework. It depends a bit on one's philosophy of measurement, I suppose. :)
 

CowboyBear

Super Moderator
#11
ruh roh. You can't ever prove anything with data, proofs are in maths and baking and alcohol, not science!

(nth time I've used this soundbite)
 

Dason

Ambassador to the humans
#13
Hey now. I can prove lots of stuff with data. For example my best time in minesweeper on this computer on the Intermediate setting is 49 seconds. That proves that I am capable of getting a score under a minute on the intermediate setting on this computer using the equipment at hand. Take that CowboyBear!
 

noetsi

Fortran must die
#14
Hey now. I can prove lots of stuff with data. For example my best time in minesweeper on this computer on the Intermediate setting is 49 seconds. That proves that I am capable of getting a score under a minute on the intermediate setting on this computer using the equipment at hand. Take that CowboyBear!
lol

Only in academic research is correlation not causality. In every non-academic organization I ever worked in, if the data pointed one way, that was reality. :)
 

CowboyBear

Super Moderator
#15
Hey now. I can prove lots of stuff with data. For example my best time in minesweeper on this computer on the Intermediate setting is 49 seconds. That proves that I am capable of getting a score under a minute on the intermediate setting on this computer using the equipment at hand. Take that CowboyBear!
Oh dear, you're right... Philosophy of science will never be the same again :p
 

Dason

Ambassador to the humans
#16
Hey now. I can prove lots of stuff with data. For example my best time in minesweeper on this computer on the Intermediate setting is 49 seconds. That proves that I am capable of getting a score under a minute on the intermediate setting on this computer using the equipment at hand. Take that CowboyBear!
Actually I should modify that. "That proves that I was at one point capable of getting a score under a minute on the intermediate setting on this computer using the equipment as it was at the time I achieved that score."

I probably didn't cover all of my bases but I realized that just because I got the score once doesn't mean that I'll always be able to achieve it.