Which test is appropriate for correlating categorical and continuous variables?

#1
Hi,
I'm looking to correlate a categorical variable (genetic risk for dyslexia: yes/no) with a few different continuous variables (rapid naming, letter knowledge, language skills), i.e. I want to know how genetic risk for dyslexia is correlated with the mentioned continuous variables. The data for rapid naming is normally distributed, the rest is not. Total sample size is 40 children. What would be the appropriate test to use? My supervisor told me that Spearman's rho is not appropriate and to use chi square, but after doing some reading it seems that chi square requires two categorical variables and is also not appropriate. Grateful for any help :)
 

Karabiner

TS Contributor
#4
want to know how genetic risk for dyslexia is correlated with the mentioned continuous variables.
You can perform a series of t tests. These will tell you whether the means of these variables differ between those with and without risk. You can also check how large the differences are in the sample.
The data for rapid naming is normally distributed, the rest is not.
Irrelevant. It sometimes matters whether the data within groups are from normal distributions, but only with n(total) < 30.
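
To make the "check how large the differences are" part concrete, here is a minimal sketch using only the Python standard library. It computes the raw mean difference, a Welch-type t statistic with its approximate degrees of freedom, and Cohen's d. The scores and the names `risk`, `no_risk`, `welch_t` and `cohens_d` are made up for illustration and are not the poster's data; using Welch's unequal-variance form rather than the pooled t test is also an assumption.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (mean difference, Welch t statistic, Welch-Satterthwaite df)."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    diff = mean(a) - mean(b)
    t = diff / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return diff, t, df


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd


# Hypothetical rapid-naming scores, for illustration only.
risk = [38.0, 42.0, 40.0]
no_risk = [50.0, 47.0, 53.0, 49.0, 51.0, 48.0, 52.0, 50.0]

diff, t, df = welch_t(risk, no_risk)
d = cohens_d(risk, no_risk)
print(f"mean difference = {diff:.2f}, t = {t:.2f}, df = {df:.2f}, d = {d:.2f}")
```

Turning the t statistic into a p-value needs the t distribution; if SciPy is available, `scipy.stats.ttest_ind(risk, no_risk, equal_var=False)` should give the same statistic together with a p-value.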

With kind regards

Karabiner
 
#6
Hi,
Thanks for your reply! Can I do a t-test with groups of 3 and 37? Unfortunately I only have 3 kids with genetic disposition for dyslexia in my sample.
 

hlsmith

Not a robit
#8
So in the overall order of things the binary marker comes first, then the scores, so @Karabiner's or @staassis' suggestions get you the closest - IMHO.

Now you are telling us that you only have 3 subjects in one of the binary groups, whew. From a scientific perspective, what kinds of generalizations can you make from 3 persons to everyone else in the world with the marker? For example, if I have data on three people, will they be a good representation of the population at large?

I would run a permutation test for each of these continuous variables, given the information you provided. You may also want to think about using a lower level of significance for your cut-off to address possible false discoveries, since you are running multiple tests using the same independent variable.
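
A minimal sketch of what such a permutation test could look like, assuming the difference in group means as the test statistic. With groups of 3 and 37 there are only 9,880 ways to pick which 3 children carry the risk label, so every relabelling can be enumerated exactly instead of sampled. The function name `exact_permutation_p` and the scores are hypothetical, and the Bonferroni division at the end is just one simple way to lower the cut-off, not the only one.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(small, large):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(small) + list(large)
    n, k = len(pooled), len(small)
    total = sum(pooled)
    observed = abs(mean(small) - mean(large))
    extreme = 0
    count = 0
    # Enumerate every way of assigning the "risk" label to k of the n children.
    for idx in combinations(range(n), k):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / k - (total - s) / (n - k))
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / count


# Hypothetical scores, for illustration only.
risk = [1.0, 2.0, 3.0]
no_risk = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
p = exact_permutation_p(risk, no_risk)

# Bonferroni-style cut-off, assuming three outcome variables are tested.
alpha = 0.05 / 3
print(f"p = {p:.4f}, significant at adjusted alpha: {p < alpha}")
```

Note that the smallest p-value such a test can produce is limited by the number of possible relabellings, which is one more reason to be careful with a group of 3.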
 
#9
Yes, 3 persons in one group sucks, I know. I’d prefer to do things quite differently (and get my hands on the additional 40 participants that data has been collected from, some of whom probably have the genetic risk), but I’m pretty bound by my supervisor and her thoughts on this :rolleyes: Also she’ll only give me half the data for reasons I don’t quite understand.
 