Hi people. Can somebody PLEASE help me with this. I've been struggling with this for a week and am getting nowhere. Just a couple of pointers would be nice.
I have a problem that I've been struggling with for some time now. I can't find a standard, non-parametric test that seems to deal with this type of correlational design. Nothing in my big stats/research methods handbook covers it, and I've not been able to get anywhere with searches.
Here's an illustration of a simplified version of two study designs I will be using to assess the way in which a sample of people classify images of facial expressions.
The example I give shows just A SINGLE SET of face images in the form of cartoon faces, but for the actual study I will be assessing a number of sets of such images, each using the face of a separate person.
The objectives are to:
1) assess which pictures are most clearly matched with a single descriptor word.
2) identify which set of pictures, that is, which human face, most successfully displays the full range of expressions.
DESIGN 1 is a forced choice format where a number of participants CHOOSE BETWEEN separate descriptor words, which are then plotted as a frequency grid for each set of photos.
DESIGN 2 is of a Likert scale format, where EVERY DESCRIPTOR WORD is rated on a scale of 0 - 6.
Before I can submit the proposal for formal approval though, I need to detail:
1) the statistical treatment I need to apply to BOTH of these designs that will answer both of these questions
2) the power calculations for the relevant tests, to decide how many participants I will require for each of the experiments using these designs.
My earlier approach was to split my grid into individual columns and apply a chi-square analysis on a row-by-row basis, comparing the observed frequencies with those that would be expected purely by chance. However, that might not be the best way of accomplishing this.
Starting with the frequency grid, is chi-square a viable way of answering my questions?
Last edited by Zenid; 01-29-2012 at 02:16 PM.
"Starting with the frequency grid, is chi-square a viable way of answering my questions?"
Definitely not for the second grid: you have too many cells with zeros.
I think having a look at our FAQ section on "Which test should I use?" may be helpful (LINK). You need to show a bit of initiative in what approach you're thinking of, figure out that test's assumptions, etc.
I usually start by looking at my research questions.
then...
- What are my IV(s) and DV(s)
- What type of data do I have? (IVs and DVs)
- At first I used the grids I linked to above to help me figure out a test, but now I generally know from experience
- Look at my assumptions and see if I meet the initial assumptions of the test (sample size, number in cell, types of data etc)
- Run the model and look at residual diagnostics and see if I am continuing to meet my assumptions (normality etc)
If I'm missing anything fellow TSers please add.
You've done a pretty good job of giving us info but we need to see some initiative so it doesn't feel like we're being your unpaid statistician. We're about helping people to learn so they don't need us next time. If we just tell you a test we won't have accomplished our goal.
"If you torture the data long enough it will eventually confess."
-Ronald Harry Coase -
Zenid (01-29-2012)
"1) assess which pictures are most clearly matched with a single descriptor word."
For every picture one descriptor word is used most often.
You could rank the pictures according to these. Perhaps you
also have a definition of a clear match (e.g. "if > 80% of
the population choose one descriptor, then I'll consider this
a clear match").
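The ranking idea above can be sketched in a few lines of Python. This is only an illustration: the counts are made up, the picture names and the `modal_share` helper are hypothetical, and the 80% cut-off is just the example threshold mentioned above, not an established criterion.

```python
# Hypothetical forced-choice counts: for each picture, how many of 100
# participants chose each descriptor word (made-up numbers).
counts = {
    "picture_happy": {"happy": 88, "surprised": 7, "sad": 1,
                      "angry": 1, "afraid": 2, "disgusted": 1},
    "picture_sad": {"happy": 2, "surprised": 3, "sad": 62,
                    "angry": 10, "afraid": 15, "disgusted": 8},
    "picture_surprised": {"happy": 10, "surprised": 45, "sad": 1,
                          "angry": 2, "afraid": 40, "disgusted": 2},
}

CLEAR_MATCH = 0.80  # assumed definition of a "clear match"


def modal_share(picture_counts):
    """Return the most-chosen descriptor and the share of choices it received."""
    total = sum(picture_counts.values())
    word, n = max(picture_counts.items(), key=lambda kv: kv[1])
    return word, n / total


# Rank pictures by how dominant their most-chosen descriptor is.
ranking = sorted(
    ((pic, *modal_share(c)) for pic, c in counts.items()),
    key=lambda row: row[2],
    reverse=True,
)

for pic, word, share in ranking:
    label = "clear match" if share > CLEAR_MATCH else "not a clear match"
    print(f"{pic}: {word} ({share:.0%}) -> {label}")
```

If a confidence interval for each share is wanted, it could be added on top of this, but that goes beyond what the ranking itself needs.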
Kind regards
K.
victorxstc (01-31-2012)
Here's another illustration (SECOND table) showing the scenario broken out by individual posed expressions. Here the "happy" expressions of all the participants are charted against the total scores given by respondents:
This makes it easier to see what I need to do: I need to compare rows and test the hypothesis that "The mean of the happy row is significantly greater than the mean of all other rows".
I've identified the Mann-Whitney Test (ordinal, unrelated) as a way of doing this and plugged it all into SPSS, but all it does is test the hypothesis that there is no significant difference between the "happy" response scores and the others. Naturally this is rejected at a good high p rating, but this tells me nothing about the hypothesis I need to test.
Thanks for the response. I'm only here because I have been smashing my head against a brick wall trying to figure this stuff out. I've read through my "Research Methods in Statistics and Psychology" and trawled the internet for any design analogous to mine and am still not much the wiser. I've finally managed to get the mocked up data into SPSS, but it won't do anything but reject the null hypothesis. I need to know how to get it to give me a comparison grid of all conditions with p-ratings. I know exactly what hypotheses I want to test, but cannot make it fit with anything in my textbooks.
So please rest assured that I am pulling my weight. I just need guidance on how to get more useful information than what chi-square gives ("yes, the variables are unrelated" - but how and in what direction?) and from what SPSS is now spitting out from my Mann-Whitney analysis (ditto - "the null hypothesis has been rejected").
this sounds like a job for log-linear analysis or probably some implementation of generalized estimating equations for GLMs, given that it seems you'll have to work with a lot of proportions here..
for all your psychometric needs! https://psychometroscar.wordpress.com/about/
Zenid (01-29-2012)
Okay, I've made some progress now: SPSS selects Kruskal-Wallis (the non-parametric equivalent of ANOVA) but still gives me the same "null hypothesis rejected" P-rating, which isn't much use to me.
When I delete all but two of my mood conditions, I can put it through the Mann-Whitney to compare condition-pairs (Happy/Surprised etc). This gives me useful and meaningful results. So my problem is how to get SPSS to give me a grid of scores for each combination of my conditions.
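Outside SPSS, the "grid of scores for each combination" described above could be sketched roughly like this. It is not SPSS output and not a recommendation of the test: the ratings are made up, the function names are hypothetical, and the p-values use the large-sample normal approximation with a tie correction and no continuity correction, which is rough for samples this small. Running many pairwise tests also raises the usual multiple-comparisons question, which this sketch does not address.

```python
import itertools
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return u1, z, p


# Hypothetical 0-6 Likert ratings of the word "happy" for three posed moods.
ratings = {
    "Happy": [6, 5, 6, 4, 5, 6, 5, 6],
    "Surprised": [3, 4, 2, 3, 5, 3, 4, 2],
    "Sad": [0, 1, 0, 2, 1, 0, 1, 0],
}

# One entry per pair of conditions: the comparison grid.
grid = {
    (a, b): mann_whitney(ratings[a], ratings[b])
    for a, b in itertools.combinations(ratings, 2)
}

for (a, b), (u, z, p) in grid.items():
    print(f"{a} vs {b}: U = {u:.1f}, z = {z:.2f}, p = {p:.4f}")
```

The sign of z shows the direction of each difference (positive means the first condition of the pair tends to be rated higher), which is the information a single omnibus "null hypothesis rejected" result does not give.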
Any SPSS boffins here?
So that covers the likert scale response side of things (tables TWO and THREE). Though I still need power calculations to tell me how many participants I need. I'll do some more reading here to try and figure that out for myself...
That leaves the forced choice side which still looks like it wants some kind of chi-square...
Last edited by Zenid; 01-29-2012 at 06:00 PM.
Nope. For the forced choice, it looks like a regular Chi-Square after all. My independent variable is "Mood Category", and my dependent variable is "Mood Classification" (how my participants classify these).
Once I plug them into SPSS I get the following:
The cell residuals (which somebody mentioned) give me the significance levels I need. As I understand it, any value greater than 1.96 gives me significance P<0.05, 2.58 for P<0.01 and so on...
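For anyone who wants to see where those cell values come from, here is a minimal sketch of adjusted standardized residuals for a contingency table. The table is made up, the function name is hypothetical, and the calculation treats every observation as independent, exactly as the ordinary chi-square test does.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = (expected
                        * (1 - row_totals[i] / n)
                        * (1 - col_totals[j] / n))
            out.append((observed - expected) / math.sqrt(variance))
        result.append(out)
    return result


# Hypothetical grid: rows = classification chosen, columns = posed mood,
# 100 participants per column (made-up numbers).
table = [
    [80, 10, 5],
    [15, 70, 20],
    [5, 20, 75],
]

for row in adjusted_residuals(table):
    # Flag cells beyond the two-tailed .05 cut-off mentioned above.
    print(["%6.2f%s" % (r, "*" if abs(r) > 1.96 else " ") for r in row])
```

A positive residual means the cell was chosen more often than expected under independence, a negative one less often, so the sign gives the direction that the overall chi-square value hides.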
However there is a problem: Each column is actually the SAME 100 participants making a forced choice (denoted by where they appear in a row cell), however Chi-Square "thinks" they're 600 in total. How do I deal with this so that this "related" aspect of the design is taken into consideration?
Last edited by Zenid; 01-31-2012 at 03:54 AM.
I am still wondering what you want to achieve.
Your first question, as stated in your first posting, IMHO
does not require such comparisons as you are proposing.
When inspecting the adjusted residuals from your
(inappropriate, as you noticed) Chi² test, you get the
information that the "happy picture" has been rated
significantly more often as happy than expected by
chance.
Statistical significance means that the difference
between chance rating and actual rating is not
exactly 0.0000... in the population. It does not
mean important or relevant. Just that the associations
found in a sample are presumably not completely due
to chance.
In other versions you wanted to test whether e.g. a "happy"
picture was rated as happy significantly more often than
(say) a "sad" picture.
Both approaches might provide you with the information
that ratings are not 100% independent from the content
of the pictures.
But that information does not solve the problem to "assess
which pictures are most clearly matched with a single
descriptor word".
Kind regards
K.
victorxstc (01-31-2012)
It's essentially a correlational design using categorical data. In psychology, statistical significance is a fundamental criterion on which we assess whether associations can be said to be meaningful, and P-ratings are needed in order to interpret the correlations or differences seen: the lower the chances of a particular difference being due to random variance, the more significant the finding.
When I say I need to "assess which pictures are most clearly matched with a single descriptor word" I mean that I must show which descriptor or descriptors are selected at a rate significantly above chance expectation for a given stimulus picture. The higher the z-score (cell residual), the higher the p-rating and this gives me an objective basis on which to assess correlations on a column by column basis. This is not at issue, and has already been agreed by my professor and I as the objective of any statistical treatment that I apply.
What IS at issue is the treatment above, which does not take into account the fact that the design is "related" in the sense that the same participants all assess each and every stimulus picture in a given set.
The obvious way around this shortcoming is to simply treat every column as a separate chi-square instead, but this is not elegant as there are many columns to cover. I was hoping there was some parameter of the SPSS analysis I could tweak to 'tell' it that the same group of participants is distributed across the cells within each column, and that the analysis should be constrained in this way. Cell residuals will tell me everything I need to know, then.
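The column-by-column version is less tedious if scripted. Below is a rough sketch under stated assumptions: each column holds one choice per participant, "chance" means every descriptor is equally likely, there are six descriptors (so df = 5), and the counts and function name are made up. It gives a goodness-of-fit chi-square per column plus a z-score per cell against chance. The columns still come from the same participants, so the separate tests are not independent of one another.

```python
import math

# Critical chi-square value for df = 5 at alpha = .05 (six descriptors assumed).
CHI2_CRIT_DF5_05 = 11.070


def column_vs_chance(column_counts):
    """Goodness-of-fit chi-square and per-cell z-scores against equal chance."""
    n = sum(column_counts)
    k = len(column_counts)
    p0 = 1 / k                      # chance probability of each descriptor
    expected = n * p0
    chi2 = sum((o - expected) ** 2 / expected for o in column_counts)
    sd = math.sqrt(n * p0 * (1 - p0))   # binomial SD of a single cell count
    z_scores = [(o - expected) / sd for o in column_counts]
    return chi2, z_scores


# Hypothetical columns: one per stimulus picture, 100 participants each.
columns = {
    "happy picture": [88, 7, 1, 1, 2, 1],
    "sad picture": [2, 3, 62, 10, 15, 8],
}

for name, col in columns.items():
    chi2, z = column_vs_chance(col)
    verdict = "differs from chance" if chi2 > CHI2_CRIT_DF5_05 else "consistent with chance"
    print(f"{name}: chi2 = {chi2:.1f} ({verdict})")
    print("   z per descriptor:", [round(v, 2) for v in z])
```

Because every column is analysed on its own 100 responses, the total of 600 never enters the calculation.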
Last edited by Zenid; 01-31-2012 at 11:19 AM.
"It's essentially a correlational design using catagorical data."
Experimental.
"In psychology, statistical significance is a fundamental criterion on which
we assess whether associations can be said to be meaningful,"
Not quite. Tests of significance are part of a decision process.
One decides whether the Null hypothesis (the hypothesis to
be nullified) may be retained or can be abandoned. The meaningfulness
of associations or of alternative hypotheses cannot be assessed
through p-values (or adjusted residuals).
"The higher the z-score (cell residual), the higher the p-rating
and this gives me an objective basis on which to assess correlations
on a column by column basis. This is not at issue, and has already
been agreed by my professor and I as the objective of any statistical
treatment that I apply."
I see. As mentioned before, a simple ranking would do the job.
And if, in addition, a definition exists for clearly matched versus
not clearly matched, one could test whether it is met or not.
Good luck, anyway
K.
victorxstc (01-31-2012)
Yes they can. That's the whole point: When you reject the null hypothesis, you are accepting the experimental hypothesis, namely that there IS a significant difference between variables or conditions that can be used in support of or against a theory you hope to test (assuming you designed the study right).
A simple ranking will not do the job, because it tells you nothing about the statistical significance of differences between conditions in the data. Ranking just tells me that one mean (or median) is greater than another, not about probabilities from which effects can be inferred.
Thanks for your efforts, but it looks like I'm going to have to go and bother the postdocs about this instead.
No, this is false (or at least: when some psychologists interpret p values this way, they are seriously mistaken). I'm in psych myself, btw. A p value simply indicates the probability of observing a statistic at least as extreme as that observed, given that the null hypothesis is exactly true. It says almost nothing about the meaningfulness of the finding. A correlation of 0.01 between two variables might be statistically significant if you have a big enough sample, but it may be so small as to be practically meaningless.
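To make the r = 0.01 point concrete, here is a small sketch. It uses the usual t statistic for a correlation, t = r * sqrt((n - 2) / (1 - r^2)), and, as a simplifying assumption, reads the p value off the normal distribution instead of the t distribution (fine for large n, rough for small n). The function name and sample sizes are just for illustration.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p value for a correlation r from n pairs."""
    t = r * math.sqrt((n - 2) / (1 - r * r))
    # Normal approximation to the t distribution (assumption: n is large).
    return math.erfc(abs(t) / math.sqrt(2))


r = 0.01
for n in (100, 10_000, 100_000, 1_000_000):
    p = correlation_p_value(r, n)
    print(f"n = {n:>9,}: p = {p:.4g}, variance explained = {r * r:.2%}")
```

The share of variance explained stays at 0.01% throughout; only the sample size changes, and with it the p value.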
If you have not yet come across any of the articles in psychology pointing out why p values do not tell us much about meaningfulness, I'd suggest starting with "The Earth is Round (p < .05)" by the brilliant Jacob Cohen.
"the lower the chances of a particular difference being due to random variance"
The p value doesn't really tell you this either. The probability of observing a statistic as or more extreme than that observed given that the null hypothesis is true is not the same thing as the probability that the finding is due to random chance.
I've deleted your other thread in the psych subforum, by the way - no need to duplicate. Most regulars use the Latest Posts function so the subforum you choose doesn't really matter.
victorxstc (02-01-2012), Zenid (02-01-2012)