# Association between categorical and numerical variables using R

#### MasterStudent

##### New Member
Hi,
I am new to this forum, so please be kind. I am not sure whether this belongs in 'R'.
I will describe my problem as well as I can. First of all, I am using R Commander for my Master's thesis.
My thesis aims to identify whether there is an association between one predictor variable (categorical: 'less than 1', '1-2' ... '10+') and one outcome (numerical, 0-60).

For these two variables alone, am I correct to use a generalized linear model with family 'gaussian'?

However, I would also like to adjust this association for other variables. All other variables are categorical: most are 'yes'/'no', but one is country, with 93 different levels. Am I correct to use a generalized linear model with family 'gaussian' here as well?

Can I use the same model when one predictor variable is numerical? (Predictors = categorical + numerical; outcome = numerical)

To identify confounding factors I have to test each variable against the predictor and the outcome.

Am I correct to use Pearson's chi-squared test for categorical <-> categorical, even though both variables have more levels than just 'yes' and 'no'? Or would I use a generalized linear model with family 'binomial'?

To test an association between a numerical predictor and a categorical outcome, would I use a generalized linear model with family 'binomial'?

Thank you for your help!

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Your wording is a little confusing, but your approach seems correct. If the outcome is continuous (and a 0-60 score can probably be treated as continuous), use family=gaussian; if the outcome is binary, use family=binomial. You can also use the GLM with family=gaussian to examine confounders. Were you thinking of examining interaction terms to do this?
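
For anyone following along, here is a minimal sketch of the two family choices in plain R. The data are simulated and every variable name (`exposure`, `smoker`, `age`, `score`, `high`) is made up for illustration; the intermediate exposure levels from the original question are left out because they were not listed.

```r
# Simulated data; all variable names are hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  # Only the three levels named in the question are used here
  exposure = factor(sample(c("less than 1", "1-2", "10+"), n, replace = TRUE),
                    levels = c("less than 1", "1-2", "10+")),
  smoker   = factor(sample(c("no", "yes"), n, replace = TRUE)),
  age      = round(runif(n, 18, 80))
)

# Numerical outcome, clipped to the 0-60 range
dat$score <- pmin(60, pmax(0, 30 + 2 * as.integer(dat$exposure) +
                               3 * (dat$smoker == "yes") + rnorm(n, sd = 8)))
# Binary outcome derived from the score, for the logistic example
dat$high <- as.integer(dat$score > 35)

# Continuous outcome: gaussian family (same fit as lm())
fit_gauss <- glm(score ~ exposure + smoker + age, family = gaussian, data = dat)
summary(fit_gauss)

# Binary outcome: binomial family (logistic regression)
fit_binom <- glm(high ~ exposure + smoker + age, family = binomial, data = dat)
summary(fit_binom)
```

Categorical predictors only need to be stored as factors; `glm()` creates the dummy variables itself, so categorical and numerical predictors can be mixed in one formula.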

#### MasterStudent

##### New Member
Sorry if my wording is confusing.

If my dependent variable is ordinal categorical ('less than 1', '1-2' ... '10+') and the independent variable is categorical, can I still use a GLM with family binomial?

Sorry, what do you mean by "Were you thinking of examining interaction terms to do this?"

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Not sure. I don't use R enough to know. It looks like below are the family options in R. Hopefully others will speak up, but I wonder whether you may need a different type of model. Try searching the web for ordinal regression in R.
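
One common option for an ordered categorical outcome is proportional-odds (ordinal logistic) regression, for example with `polr()` from the MASS package. The following is only a sketch on simulated data, with hypothetical variable names (`group`, `outcome`), and is not necessarily the right model for the thesis data:

```r
library(MASS)  # provides polr(); MASS ships with standard R installations

set.seed(2)
n <- 300
group <- factor(sample(c("no", "yes"), n, replace = TRUE))

# Latent score shifted by group, then cut into three ordered categories
latent <- rnorm(n) + 0.8 * (group == "yes")
dat <- data.frame(
  group   = group,
  outcome = cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf),
                labels = c("less than 1", "1-2", "10+"),
                ordered_result = TRUE)
)

# Proportional-odds model; Hess = TRUE keeps the Hessian so that
# summary() can report standard errors
fit_ord <- polr(outcome ~ group, data = dat, Hess = TRUE)
summary(fit_ord)

# Odds ratio for group "yes" vs "no" of being in a higher category
exp(coef(fit_ord))
```

The outcome has to be an ordered factor; the model then estimates one coefficient per predictor plus one cut-point between each pair of adjacent categories.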

#### MasterStudent

##### New Member
Thanks for your help so far. Maybe someone else can help as well.

Another question: if I identify a confounding variable during my analysis, I can stratify. However, what can I do if I identify several confounding variables (2-3)? Is there a way to control for them? I have seen multivariate regression mentioned, but I do not think I can perform that in R Commander.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
To control for a confounder, you just put that variable in the model; the model then adjusts for its direct effect, and the estimated effect of the other variable changes accordingly (usually it shrinks). With several confounders, add each of them to the model in the same way. Models can get very complex when there is a lot of confounding, which makes them harder to interpret.
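
As a rough illustration of adjusting for more than one variable at once, the sketch below compares a crude and an adjusted model on simulated data. All names (`exposure`, `conf1`, `conf2`, `outcome`) and effect sizes are invented for the example:

```r
set.seed(3)
n <- 500

# conf1 influences both the exposure and the outcome (a true confounder);
# conf2 only influences the outcome
conf1 <- factor(sample(c("no", "yes"), n, replace = TRUE))
conf2 <- factor(sample(c("no", "yes"), n, replace = TRUE))
p_high <- ifelse(conf1 == "yes", 0.7, 0.3)
exposure <- factor(ifelse(runif(n) < p_high, "high", "low"),
                   levels = c("low", "high"))

outcome <- 20 + 2 * (exposure == "high") + 6 * (conf1 == "yes") +
  1 * (conf2 == "yes") + rnorm(n, sd = 4)
dat <- data.frame(outcome, exposure, conf1, conf2)

# Crude model: exposure only
crude <- glm(outcome ~ exposure, family = gaussian, data = dat)

# Adjusted model: both additional variables simply added as covariates
adjusted <- glm(outcome ~ exposure + conf1 + conf2, family = gaussian, data = dat)

# Compare the exposure coefficient before and after adjustment
c(crude    = unname(coef(crude)["exposurehigh"]),
  adjusted = unname(coef(adjusted)["exposurehigh"]))
```

A noticeable change in the exposure coefficient between the two models is the usual informal sign that the added variables were confounding the crude estimate.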