lmer output: variable re-levelling

drewmac

New Member
#1
Morning/Evening all

I'm having a bit of a time getting the lme4 package to list ALL of my IVs, rather than all but one. By way of example, here is my model and output (testing the effect of a voice on participant answers to yes/no questions):

# family = binomial makes this a generalised linear mixed model,
# so in current lme4 the call is glmer() rather than lmer()
model = glmer(supportive ~ Voice + question + (1 | participant),
              data = nopleasant, family = binomial)

#Fixed effects:
#                        Estimate Std. Error z value Pr(>|z|)
#(Intercept)               1.3981     0.2140   6.534 6.42e-11 ***
#Voicefemale               0.5781     0.2340   2.470   0.0135 *
#Voicegaymale              0.5189     0.2333   2.225   0.0261 *
#Voiceolder male           0.2275     0.2205   1.032   0.3022
#Voicestraightmale         0.3760     0.2215   1.698   0.0896 .
#questionenglishofficial  -2.2065     0.2213  -9.970  < 2e-16 ***
#questionculture          -1.6641     0.2188  -7.605 2.86e-14 ***
#questionequalaccess      -1.1896     0.2211  -5.380 7.46e-08 ***

You will see that it lists 4 voices. However, my Voice predictor has 5 levels. Is there a way of getting R, or lmer, to list *all* of the IVs?

When I run summary(model), I have the same problem. I'd like to see the coefficient and other values for each and every IV.

Much ta for any input.
 

Jake

Cookie Scientist
#2
If your factor has k levels, then it requires only k - 1 predictors to represent the effects of that factor. This is why there are only 4 predictors in your model representing your 5-level factor. The default in R for unordered factors is dummy (treatment) coding, so the level that is "left out" of your model is being used as the baseline category. The intercept gives the coefficient for that level*, and the coefficients for the 4 predictors give the difference between the baseline category and the category represented by that predictor. This is exactly the same as in classical ANOVA models.

*Note that this is somewhat more complicated in your model because you have more than one dummy-coded factor. So the intercept is actually the predicted value when both factors in your model are at their "baseline" levels.
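You can see the dummy coding for yourself with a small sketch (the level names below are made up for illustration; substitute your actual Voice levels):

```r
## Hypothetical 5-level factor, dummy (treatment) coded by default
voice <- factor(c("child", "female", "gaymale", "oldermale", "straightmale"))

## k = 5 levels -> k - 1 = 4 contrast columns; the alphabetically first
## level is absorbed into the intercept rather than getting its own column
contrasts(voice)

## model.matrix() shows the design matrix R actually fits:
## an intercept column plus the 4 dummy columns
model.matrix(~ voice)
```

The row of the contrast matrix that is all zeros corresponds to the baseline level, which is why it never appears in the coefficient table.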
 

drewmac

New Member
#3
Jake,

Thanks very much for confirming what I had suspected, and for explaining it so well. Would it be useful, in that case, to have R re-name the intercept as the baseline category, or is that misleading?

I have, oddly, managed to avoid using ANOVAs in my work and gone straight from Mann-Whitney and Fisher's exact tests to logistic mixed-effects regressions. This forum has helped me heaps :)
 

Dason

Ambassador to the humans
#4
It is a little misleading, because you have to interpret the intercept differently from the other effects. The intercept is literally the estimate of the baseline category's mean (well, for a logistic regression it's the baseline log-odds), whereas the other parameter estimates are estimates of differences from that baseline category.
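A quick toy example (simulated data, nothing to do with your study) shows this intercept-as-baseline behaviour in an ordinary linear model:

```r
## Simulated one-way design: three groups with different means
set.seed(1)
g <- factor(rep(c("a", "b", "c"), each = 10))
y <- rnorm(30, mean = c(1, 3, 5)[as.integer(g)])

fit <- lm(y ~ g)

coef(fit)["(Intercept)"]  # estimate for baseline group "a"
mean(y[g == "a"])         # identical: intercept IS the baseline mean here
coef(fit)["gb"]           # estimated difference mean(b) - mean(a)
```

In the logistic (glmer) case the same logic holds, but on the log-odds scale rather than the scale of raw means.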

If instead what you're asking is if you can change which category it calls the baseline then the answer is yes.
 

jpkelley

TS Contributor
#5
If instead what you're asking is if you can change which category it calls the baseline then the answer is yes.
@drewmac...if you didn't know about it already, check out the function relevel(). It's not misleading at all to change which level is considered the baseline; R just picks the first level alphabetically by default. Obviously, if you had a control treatment AND you wanted to report the model output as a table, then it would be best to set the control level as the baseline.
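A sketch of how that would look for your model (the level name "straightmale" is guessed from your output above, and I'm using glmer(), which is how current lme4 fits binomial mixed models):

```r
## Make "straightmale" the baseline level of Voice
nopleasant$Voice <- relevel(nopleasant$Voice, ref = "straightmale")

## Refit: the intercept now represents the "straightmale" baseline,
## and the four remaining Voice coefficients are differences from it
model2 <- glmer(supportive ~ Voice + question + (1 | participant),
                data = nopleasant, family = binomial)
summary(model2)
```

Note that relevel() only works on unordered factors, which is what you have here.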
 

drewmac

New Member
#6
It is a little misleading because you have to interpret the intercept differently than the other effects.
Thanks. In that case, to report the coefficient (and other values, e.g. the standard error) of the baseline level (baseline only by alphabetical default), I would re-run the model using a different level as the baseline, is that right?

@jpkelley - thanks for the code, will check it out shortly :)
 

drewmac

New Member
#7
@jpkelley and Dason. Now that I have learned to re-level my explanatory variables (i.e. choose the baseline level) and how to sensibly interpret the output, I'd like to know the best practice for choosing the baseline, seeing as different coefficients are returned depending on what you set as the baseline. I am running multiple models for different question groupings (not all my experimental questions follow the same direction; they are grouped by hypothesis direction). Do I pick one baseline and stick with it, or is there a less rigid path to follow?

If I haven't been clear, or you feel a more in depth example would be helpful, please ask.

Much ta
 

Dason

Ambassador to the humans
#8
The results are the same no matter what you choose for your baseline. It's true that the parameter estimates might differ, but that's because the parameters represent different things when you change your baseline. You can still answer all of the same questions one way or another; it's just that a given choice of baseline makes some questions easier to answer than others.
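You can verify this invariance directly with simulated data: two parameterisations of the same model give identical fitted values, even though the coefficients are labelled differently.

```r
## Same model, two different baselines (toy data)
set.seed(2)
g <- factor(rep(c("a", "b", "c"), each = 5))
y <- rnorm(15)

f1 <- lm(y ~ g)                        # baseline "a" (alphabetical default)
f2 <- lm(y ~ relevel(g, ref = "c"))    # baseline "c"

coef(f1)  # different numbers and labels...
coef(f2)  # ...but the same underlying model:
all.equal(unname(fitted(f1)), unname(fitted(f2)))  # TRUE
```

So pick whichever baseline makes your hypotheses easiest to read off the table, and report it consistently.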