Thread: Assessing covariates to include in model

  1. #1

    Assessing covariates to include in model

    Hi all, I have a couple of basic questions that it would be great to get some help on.

    I have a continuous outcome (brain volume: total and divided into specific brain regions) and a categorical exposure (smoking, 4 categories).
    I am using linear regression to analyse this relationship. So far I am not seeing very much in my adjusted models!

    I have information on many covariates, which are coded in different formats/types: binary (e.g. sex), categorical (e.g. education level), and continuous (e.g. IQ). I want to check the associations of these with both my exposure (categorical) and my outcome (continuous).

    Should linear regression be used for all of these? Or do I need to use another statistical test when looking at categorical vs categorical data, etc.?

    Thank you. Any advice would be hugely appreciated; I am very new to epidemiology and biostatistics.

  2. #2
    staassis
    New York

    Re: Assessing covariates to include in model

    If the sample size is not very large, you should use linear regression where the candidate predictors are the following:

    1) original numeric variables,
    2) nonlinear transformations of the original numeric variables,
    3) binary dummy variables representing each category of the nominal variables (except for the reference categories),
    4) interactions of selected members of groups 1) - 3).
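
    The four groups of candidate predictors listed above can be sketched in a single R formula. This is only an illustration on simulated data: the variable names, category labels and effect sizes are all assumptions made up for the example, not the poster's data.

```r
# Simulated stand-in data; all names, levels and effect sizes are hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  smoking   = factor(sample(c("never", "former", "light", "heavy"), n, replace = TRUE),
                     levels = c("never", "former", "light", "heavy")),
  sex       = factor(sample(c("F", "M"), n, replace = TRUE)),
  education = factor(sample(c("low", "mid", "high"), n, replace = TRUE),
                     levels = c("low", "mid", "high")),
  iq        = rnorm(n, mean = 100, sd = 15)
)
dat$volume <- 1200 + 50 * (dat$sex == "M") + 0.5 * dat$iq + rnorm(n, sd = 40)

# 1) original numeric variable:   iq
# 2) nonlinear transformation:    I(iq^2)
# 3) dummy variables:             lm() expands each factor into dummies,
#                                 using the first level as the reference category
# 4) interaction of selected terms: smoking:sex
fit <- lm(volume ~ smoking + sex + education + iq + I(iq^2) + smoking:sex,
          data = dat)

# The columns of the design matrix show the dummy coding that lm() applied.
colnames(model.matrix(fit))
```

    Looking at the design matrix columns is a quick way to confirm which category was treated as the reference and how many coefficients the model actually estimates.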

    A common rule of thumb says that there should be at least 15 observations per coefficient to be estimated, so not all of the candidate predictors may find their way into any given model. You should use a standard model selection procedure (forward stepwise selection, backward stepwise selection, lasso, etc.) to determine which predictors to keep in the final model and which to drop.
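
    As a rough sketch of the rule of thumb and of backward stepwise selection, the following uses step() from base R, which drops terms by AIC. The data are simulated and every variable name is hypothetical. Forcing the exposure to stay in the model through the lower scope is a choice made for this example, on the assumption that the exposure effect is the quantity of interest.

```r
# Simulated stand-in data; all names, levels and effect sizes are hypothetical.
set.seed(2)
n <- 150
dat <- data.frame(
  smoking   = factor(sample(c("never", "former", "light", "heavy"), n, replace = TRUE)),
  sex       = factor(sample(c("F", "M"), n, replace = TRUE)),
  education = factor(sample(c("low", "mid", "high"), n, replace = TRUE)),
  iq        = rnorm(n, mean = 100, sd = 15)
)
dat$volume <- 1200 + 50 * (dat$sex == "M") + rnorm(n, sd = 40)

# Rule of thumb: at least 15 observations per estimated coefficient.
max_coef <- floor(n / 15)

full <- lm(volume ~ smoking + sex + education + iq, data = dat)
n_coef <- length(coef(full))  # intercept + dummies + slope
n_coef <= max_coef            # TRUE means the full model respects the rule

# Backward stepwise selection by AIC; smoking is kept in every candidate model.
sel <- step(full, scope = list(lower = ~ smoking),
            direction = "backward", trace = 0)
formula(sel)
```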

    You do not have to use ANOVA, since it is algebraically equivalent to linear regression. However, choosing the ANOVA option in some statistical packages (SPSS, SAS, ...) produces extra, informative output.
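
    The equivalence can be checked numerically. The sketch below, on simulated data with hypothetical variable names, fits the same one-factor model with lm() and with aov() and compares the overall F statistics.

```r
# Simulated stand-in data; names and values are hypothetical.
set.seed(3)
n <- 120
smoking <- factor(sample(c("never", "former", "light", "heavy"), n, replace = TRUE))
volume  <- 1200 + rnorm(n, sd = 40)

fit_lm  <- lm(volume ~ smoking)   # regression on dummy variables
fit_aov <- aov(volume ~ smoking)  # one-way ANOVA

# Overall F statistic from the regression summary ...
f_lm  <- unname(summary(fit_lm)$fstatistic["value"])
# ... and the F statistic for the smoking term from the ANOVA table.
f_aov <- summary(fit_aov)[[1]][["F value"]][1]

all.equal(f_lm, f_aov)
```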

    If the sample size is very large and your focus is on the predictive accuracy (not interpretability), you can experiment with data mining methods (like boosted trees, SVM and such).

  3. #3

    Re: Assessing covariates to include in model


    In order to adjust the volume for the influence of these additional covariates, you should include them in your regression analysis. The usual way is to build several reasonable models:

    mod1 <- lm(volume ~ exposure)        # unadjusted
    mod2 <- lm(volume ~ exposure + sex)  # adjusted for sex
    mod3 <- lm(volume ~ exposure * sex)  # both main effects plus their interaction
    mod4 <- lm(volume ~ exposure + IQ)   # adjusted for IQ

    and then you can compare them by their AIC values, for example with R's AIC() function, which accepts several fitted models at once.

    The model with the lowest AIC is the one you should choose. It will adjust your test for the effect of the additional covariates.
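
    The comparison described above can be sketched end to end as follows. The data are simulated purely so that the example runs; the data frame, category labels and effect sizes are assumptions, and only the model formulas follow the ones in this post.

```r
# Simulated stand-in data; levels and effect sizes are hypothetical.
set.seed(4)
n <- 200
dat <- data.frame(
  exposure = factor(sample(c("never", "former", "light", "heavy"), n, replace = TRUE)),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  IQ       = rnorm(n, mean = 100, sd = 15)
)
dat$volume <- 1200 + 50 * (dat$sex == "M") + rnorm(n, sd = 40)

mod1 <- lm(volume ~ exposure, data = dat)
mod2 <- lm(volume ~ exposure + sex, data = dat)
mod3 <- lm(volume ~ exposure * sex, data = dat)
mod4 <- lm(volume ~ exposure + IQ, data = dat)

# AIC() returns a data frame with one row per model (columns df and AIC).
aic_tab <- AIC(mod1, mod2, mod3, mod4)

# Sort so that the model with the lowest AIC comes first.
aic_tab[order(aic_tab$AIC), ]
```

    AIC values are only comparable when all models are fitted to the same observations, so rows with missing covariate values should be handled before fitting.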
