Thread: Association between categorical and numerical variables using R

  1. #1
    Points: 113, Level: 2
    Level completed: 26%, Points required for next Level: 37

    Posts
    10
    Thanks
    4
    Thanked 0 Times in 0 Posts

    Hi,
    I am new to this forum, so please be kind :-). I am not sure if this belongs in 'R'.
    I will describe my problem as well as I can. First of all, I am using R Commander for my master's thesis.
    The aim of my thesis is to identify whether there is an association between one predictor variable (categorical: 'less than 1', '1-2' ... '10+') and one outcome (numerical, 0-60).

    For just these two variables, am I correct to use a generalized linear model with family 'gaussian'?

    However, I would also like to adjust this association for other variables. All the other variables are categorical: most are 'yes'/'no', but one is country, with 93 different levels. Am I correct to use a generalized linear model with family 'gaussian' here as well?

    Can I use the same model when one predicting variable is numerical? (Predictor = categorical + numerical; outcome = numerical)

    To identify confounding factors I have to test each variable against the predictor and the outcome.

    Am I correct to use Pearson's chi-squared test for categorical <-> categorical, even though both variables have more levels than just 'yes' and 'no'? Or would I use a generalized linear model with family 'binomial'?

    To test the association between a numerical predictor and a categorical outcome, would I use a generalized linear model with family 'binomial'?

    Thank you for your help!

  2. #2
    Omega Contributor
    Points: 38,334, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,998
    Thanks
    398
    Thanked 1,186 Times in 1,147 Posts

    Your wording is a little confusing, but it seems correct. If the outcome is continuous (and you can probably treat 0-60 as continuous), use family = gaussian; if the outcome is binary, use family = binomial. You can also use the GLM with family = gaussian to examine confounders. Were you thinking of examining interaction terms to do this?
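
    For anyone following along, here is a minimal sketch of what those glm() calls could look like in plain R. The data are simulated and every variable name (exposure, smoker, age, score, high) is made up for illustration, so treat it as a template rather than the actual thesis analysis. It also shows that a categorical and a numerical predictor can sit in the same model.

```r
# Simulated stand-in data; all variable names are hypothetical.
set.seed(1)
n  <- 200
lv <- c("less than 1", "1-2", "3-5", "6-9", "10+")
dat <- data.frame(
  exposure = factor(sample(lv, n, replace = TRUE), levels = lv),
  smoker   = factor(sample(c("no", "yes"), n, replace = TRUE)),
  age      = round(runif(n, 18, 80))
)
# Numerical outcome, clipped to the 0-60 range
dat$score <- pmin(60, pmax(0, 30 + 2 * as.integer(dat$exposure) + rnorm(n, sd = 8)))

# Crude model: numerical outcome, one categorical predictor
fit_crude <- glm(score ~ exposure, family = gaussian, data = dat)

# Adjusted model: categorical and numerical covariates in the same formula
fit_adj <- glm(score ~ exposure + smoker + age, family = gaussian, data = dat)
summary(fit_adj)

# Binary outcome (here an arbitrary cut of the score): switch to family = binomial
dat$high <- as.integer(dat$score > 40)
fit_bin  <- glm(high ~ exposure + smoker + age, family = binomial, data = dat)
summary(fit_bin)
```

    With family = gaussian and the default identity link, glm() gives the same coefficients as lm(), so either function can be used for a numerical outcome.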
    Stop cowardice, ban guns!

  3. The Following User Says Thank You to hlsmith For This Useful Post:

    MasterStudent (03-10-2016)

  4. #3
    Sorry if my wording is confusing.

    If my dependent variable is ordinal categorical ('less than 1', '1-2' ... '10+') and the independent variable is categorical, can I still use a GLM with family 'binomial'?

    Sorry, what do you mean by "Were you thinking of examining interaction terms to do this?"

  5. #4
    hlsmith

    Not sure. I don't use R enough to know. The family options in R appear to be the ones listed below. Hopefully others will speak up, but I wonder whether you need a different type of model. Search the web a little for ordinal regression in R.


    family(object, ...)
    binomial(link = "logit")
    gaussian(link = "identity")
    Gamma(link = "inverse")
    inverse.gaussian(link = "1/mu^2")
    poisson(link = "log")
    quasi(link = "identity", variance = "constant")
    quasibinomial(link = "logit")
    quasipoisson(link = "log")
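
    None of those families handles an ordered outcome with more than two levels. One commonly used option for ordinal regression in R is a proportional-odds model via polr() from the MASS package. The sketch below is only an illustration on simulated data: the variable names (group, years) and the cut points are made up, and whether the proportional-odds assumption suits the real data would need checking.

```r
library(MASS)  # recommended package, normally installed together with R

# Simulated stand-in data; variable names and cut points are hypothetical.
set.seed(2)
n  <- 300
lv <- c("less than 1", "1-2", "3-5", "6-9", "10+")
dat <- data.frame(group = factor(sample(c("no", "yes"), n, replace = TRUE)))

# Build an ordered outcome by cutting a latent logistic variable
latent    <- 0.8 * (dat$group == "yes") + rlogis(n)
dat$years <- cut(latent, breaks = c(-Inf, -1, 0, 1, 2, Inf),
                 labels = lv, ordered_result = TRUE)

# Proportional-odds logistic regression: ordered outcome, categorical predictor
fit_ord <- polr(years ~ group, data = dat, Hess = TRUE)
summary(fit_ord)

# Coefficient on the odds-ratio scale
exp(coef(fit_ord))
```

    The outcome has to be a factor with its levels in the right order, otherwise the fitted thresholds will not correspond to the intended ordering.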

  6. The Following User Says Thank You to hlsmith For This Useful Post:

    MasterStudent (03-10-2016)

  7. #5
    Thanks for your help so far. Maybe someone else can help as well.


    Another question. If I identify a confounding variable during my analysis, I can stratify. However, what can I do if I identify several confounding variables (2-3)? Is there a way to control for them? I saw multivariate regression mentioned, but I do not think I can perform that in R Commander.

  8. #6
    hlsmith

    To control for a confounder, you just put that variable in the model. The model then accounts for its direct effect and adjusts (typically shrinks) the estimated effect of the other variable. Things can get very complex if there is a lot of confounding, though, which makes the model harder to interpret.
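
    A small simulated illustration of this, assuming a single binary confounder (all names here are hypothetical). The outcome is generated so that the exposure has no real effect, yet the crude model should show one because the confounder drives both exposure and outcome.

```r
# Simulated stand-in data; variable names are hypothetical.
set.seed(3)
n <- 500
confounder <- rbinom(n, 1, 0.5)
# Exposure is more likely when the confounder is present
exposure <- rbinom(n, 1, ifelse(confounder == 1, 0.7, 0.3))
# Outcome depends on the confounder only; the true exposure effect is zero
outcome <- 20 + 10 * confounder + rnorm(n, sd = 5)

dat <- data.frame(outcome,
                  exposure   = factor(exposure),
                  confounder = factor(confounder))

crude    <- glm(outcome ~ exposure, family = gaussian, data = dat)
adjusted <- glm(outcome ~ exposure + confounder, family = gaussian, data = dat)

# Compare the exposure coefficient before and after adjustment
coef(crude)["exposure1"]
coef(adjusted)["exposure1"]
```

    The crude coefficient should sit clearly away from zero and the adjusted one close to it. With two or three confounders, the same idea applies: add each one as a further term on the right-hand side of the formula.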

  9. The Following User Says Thank You to hlsmith For This Useful Post:

    MasterStudent (03-10-2016)
