moderator vs. mediator variable


I've encountered these terms 100s of times and looked up their definition and tried to understand them at least 15 times. I'm really trying to get them today:

Here's what I've come up with:

A moderator can affect the direction and size of an effect between 2 other variables (x1 and y), but if you remove the moderator (x2) the effect between x1 and y still remains.

A mediator explains the relationship between 2 variables (x1 and y). If you remove the effects of the mediator (x2) there no longer exists a relationship between x1 and y.

Please critique my thoughts and if you have anything to add that would be helpful please assume nothing about my intellect as I really want to grasp this one.


TS Contributor
"if you remove the moderator (x2) the effect between x1 and y still remains."
If there's a dichotomous moderator group1 & group2,
and both groups have the same size, and x1 is
interval scaled and y is interval scaled, and in group 1
the correlation between x1 and y is +0.5 and in
group 2 it is r= -0.5, then I guess if you remove
the moderator, you see no main effect of x1.
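
A quick simulation makes this concrete. This is only a sketch: the group size, the seed, and the helper `sim_group` are assumptions made up for illustration, not anything from a real data set.

```r
# Simulated illustration (all values are assumptions, not real data).
set.seed(1)
n <- 5000                      # observations per group

# Draw (x1, y) pairs with a target population correlation rho
sim_group <- function(n, rho) {
  x1 <- rnorm(n)
  y  <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)
  data.frame(x1 = x1, y = y)
}

g1 <- sim_group(n,  0.5)       # group 1: correlation near +0.5
g2 <- sim_group(n, -0.5)       # group 2: correlation near -0.5
pooled <- rbind(g1, g2)        # moderator ignored, groups stacked

r_g1     <- cor(g1$x1, g1$y)
r_g2     <- cor(g2$x1, g2$y)
r_pooled <- cor(pooled$x1, pooled$y)
print(round(c(group1 = r_g1, group2 = r_g2, pooled = r_pooled), 2))
```

With equal group sizes the two opposite-signed relationships cancel, so the pooled correlation should come out close to zero even though x1 is clearly related to y inside each group.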

Kind regards



Less is more. Stay pure. Stay poor.
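Not so familiar with mediators and when they are used; my brain always wants to group mediation with controlling for other variables, but I am pretty sure that is wrong. In biostatistics we use effect modification, which is kind of like interaction. And I could be wrong here: I used to think of confounding as a special case of effect modification, but the two are usually treated as separate ideas (a confounder biases the crude estimate, while an effect modifier makes the effect genuinely differ across strata).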


Less is more. Stay pure. Stay poor.
For me, effect modification always goes back to thinking of relative risks (RR): stratifying on an additional variable and comparing the stratum-specific RRs to each other and to the crude RR. But this term may mean something totally different in other areas.
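
The stratify-and-compare idea can be sketched in a few lines of R. The counts below are hypothetical, invented purely so the arithmetic is easy to follow, and `rr` is just an illustrative helper name.

```r
# Hypothetical counts, invented purely for illustration.
# Relative risk = risk in the exposed / risk in the unexposed
rr <- function(cases_exp, n_exp, cases_unexp, n_unexp) {
  (cases_exp / n_exp) / (cases_unexp / n_unexp)
}

rr_a     <- rr(40, 100, 20, 100)                      # stratum A
rr_b     <- rr(20, 100, 20, 100)                      # stratum B
rr_crude <- rr(40 + 20, 100 + 100, 20 + 20, 100 + 100) # strata collapsed
print(c(stratum_A = rr_a, stratum_B = rr_b, crude = rr_crude))
```

Here the stratum-specific RRs are 2 and 1, while the crude RR is 1.5. The fact that the two strata disagree with each other is what would be read as effect modification by the stratifying variable.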


Phineas Packard
If a moderator is present, then the relationship between x and y is different for different levels of the moderator M. For example, the relationship between reading ability (x) and school engagement (y) may be different for boys and girls (where gender is the moderator M). I like to think of moderation as being like a multigroup analysis. For example, if I thought gender was a moderator of some regression effect I was interested in, I could run the regression in boys and girls separately and see if the results were different.
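
Here is a minimal sketch of that multigroup idea on simulated data. The variable names (`reading`, `engage`, `girl`), the 0/1 coding, and the "true" slopes of 0.2 and 0.6 are all assumptions chosen for illustration.

```r
# Simulated data; slopes and coding are assumptions for illustration only.
set.seed(42)
n <- 200                                   # children per group
girl    <- rep(c(0, 1), each = n)          # moderator: 0 = boy, 1 = girl
reading <- rnorm(2 * n)
# Assumed true slopes: 0.2 for boys, 0.2 + 0.4 = 0.6 for girls
engage  <- 0.2 * reading + 0.4 * girl * reading + rnorm(2 * n)
d <- data.frame(girl, reading, engage)

# Multigroup view: one regression per group
fit_boys  <- lm(engage ~ reading, data = d[d$girl == 0, ])
fit_girls <- lm(engage ~ reading, data = d[d$girl == 1, ])

# Single-equation view: interaction between reading and the moderator
fit_int <- lm(engage ~ reading * girl, data = d)

slope_boys  <- unname(coef(fit_boys)["reading"])
slope_girls <- unname(coef(fit_girls)["reading"])
slope_diff  <- unname(coef(fit_int)["reading:girl"])
print(c(boys = slope_boys, girls = slope_girls, interaction = slope_diff))
```

With a two-level moderator, the interaction coefficient from the single model equals the difference between the two group-specific slopes, so the two views give the same answer. The single model has the advantage of providing a direct test of whether the slopes differ.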

Mediation is about a mechanism (M) that explains why x affects y. For example, the reason girls choose to study physical sciences at a lower rate than boys may be due to gendered parental expectations. In other words, the relationship X (gender) -> Y (physical science college majors) is the result of a mechanism (i.e. it is not gender itself that leads to lower science uptake among women). Rather, the mechanism is X (gender) -> M (gendered parental expectations) -> Y (physical science majors).

EDIT: While I am at it the distinction between moderation and mediation gets a bit grey in causal mediation or conditional indirect effects as these increasingly common models involve both moderation and mediation processes.


Probably A Mammal
lol we just covered this in my psych class the other day. I saw the title of this thread and thought "uh oh! somebody else just started class!"

I haven't seen the statistics of it yet, but a moderator is a variable that moderates (changes) the relationship between the response and a predictor (e.g., Y ~ X may be intensified, mitigated, or reversed due to, say, gender if it's added to the model).

On the other hand, a mediator mediates (connects) two variables, and tends to revolve around expanding on the explanatory power of the model (e.g., Y ~ X + M explains the relationship between Y ~ X better than X alone).

At least, that's what I took away from it, and hopefully am not contradicting anyone above! (too lazy to read; i'm going to nap now)


Probably A Mammal
Yeah. M counts as partially explaining some of how X affects Y, which looks like something at a basic level you can include in your model along with an interaction term. Thus Y ~ X + M + X:M. Of course, that page goes into other ways mediators can be involved, but the most basic just looked like Y ~ X*M.
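
For what it's worth, the two formulas written above are the same model in R, since `X*M` is shorthand for `X + M + X:M`. A small check on simulated data (the coefficients used to generate `y` are arbitrary assumptions):

```r
# Simulated data; generating coefficients are arbitrary assumptions.
set.seed(7)
x <- rnorm(100)
m <- rnorm(100)
y <- 0.5 * x + 0.3 * m + 0.2 * x * m + rnorm(100)

fit_long  <- lm(y ~ x + m + x:m)   # main effects plus interaction, spelled out
fit_short <- lm(y ~ x * m)         # shorthand for the same terms

same_coefs <- isTRUE(all.equal(coef(fit_long), coef(fit_short)))
print(same_coefs)
```

Note that a product term like this is how an interaction (moderation) is usually specified; whether it says anything about mediation is a separate question.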


Phineas Packard
For mediation you essentially need the following:

\( Y = \alpha + \beta_{x1} X \quad (1)\)

\( M = \alpha + \beta_{x2} X \quad (2)\)

\( Y = \alpha + \beta_m M + \beta_{x3} X \quad (3)\)

Mediation occurs if \(\beta_{x1}\) is significant in formula 1 but the corresponding coefficient in formula 3, \(\beta_{x3}\), is no longer significant (or, in many cases, we just want it to be smaller). If this is the case, we can then decompose the effect of x on y into total, indirect (the effect of x on y that is mediated by m), and direct effects (the remaining effect of x on y after the indirect effect is accounted for).

Total effects = \(\beta_{x1}\)
Direct effects = \(\beta_{x3}\)
Indirect effects = \(\beta_{x2}\) * \(\beta_{m}\)
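
The decomposition can be tried out on simulated data. This is a sketch under assumptions: the path values (0.6, 0.5, 0.2), sample size, and seed are invented, and plain `lm` fits stand in for whatever path-modelling software you would normally use.

```r
# Simulated data; path values are assumptions for illustration only.
set.seed(123)
n <- 500
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)              # assumed path x -> m
y <- 0.5 * m + 0.2 * x + rnorm(n)    # assumed paths m -> y and x -> y

fit1 <- lm(y ~ x)        # formula (1): total effect of x
fit2 <- lm(m ~ x)        # formula (2): effect of x on the mediator
fit3 <- lm(y ~ m + x)    # formula (3): mediator and x together

total    <- unname(coef(fit1)["x"])                     # beta_x1
direct   <- unname(coef(fit3)["x"])                     # beta_x3
indirect <- unname(coef(fit2)["x"] * coef(fit3)["m"])   # beta_x2 * beta_m
print(c(total = total, direct = direct, indirect = indirect))
print(all.equal(total, direct + indirect))
```

For ordinary least squares fitted on the same cases, the total effect equals the direct plus the indirect effect, which is what the last line checks. Testing whether the indirect effect differs from zero is a separate step and is not shown here.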

EDIT: Note how this differs from moderation. Moderation is estimated in a single equation
\(Y = \beta_{01} + \beta_{11}x + \beta_{12}z + \beta_{13}xz\)

Plugging in the applicable numbers here will give you the effect of x on y for different values of z. Here is some R code of mine which illustrates the relationship between experience and self-concept for different levels of workplace seniority (i.e. moderation).
# General workplace self-concept: effect of experience at low and high seniority
plot(1, type = "n", xlim = c(-3, 5), ylim = c(-2, 2), xlab = "Experience", ylab = "Self-concept", main = "General Workplace")
# Low seniority (z = -2 SD): solid line
curve(-.008*x + .137*(-2) + .100*(-2)*x, from = -3, to = 3, add = TRUE)
# High seniority (z = +2 SD): dotted line
curve(-.008*x + .137*2 + .100*2*x, from = -3, to = 3, lty = 3, add = TRUE)
# Label positions were found with locator()
text(4, .85, "High Seniority (+2 SD)")
text(4, -.90, "Low Seniority (-2 SD)")
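
The two lines in the plot differ in slope because, in the moderation equation, the effect of x on y at a given z is \(\beta_{11} + \beta_{13} z\). A short sketch of that calculation, using the coefficients from the plotting code (`simple_slope` is just an illustrative helper name, not part of any package):

```r
# Coefficients copied from the curve() calls in the plotting code.
b_x  <- -0.008   # experience
b_xz <-  0.100   # experience x seniority interaction
# (The seniority main effect, 0.137, shifts each line up or down
#  but does not change its slope, so it is not needed here.)

# Slope of self-concept on experience at a given seniority z (in SD units)
simple_slope <- function(z) b_x + b_xz * z

slopes <- sapply(c(low = -2, high = 2), simple_slope)
print(slopes)
```

This gives a slope of -0.208 at low seniority and 0.192 at high seniority, i.e. the relationship between experience and self-concept changes sign across levels of the moderator.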


Probably A Mammal
So in all 3 of the equations, \(\beta_{xi}\) is the coefficient on the same X, right? So basically Y ~ X is significant but Y ~ X + M is no longer (as) significant in X, due to the fact that M ~ X is significant? Isn't that just multicollinearity between two of the predictors? The \(\alpha\) are different in each equation, too, right?


Phineas Packard
You are correct in relation to X and your formulation of the other equations. I think it is easier to consider the intercepts as zero in all of them for now. As for the issue of multicollinearity, you are correct (in that the variables must obviously be collinear). The distinction to be made is that a theoretical interpretation is applied to this (along with all of the associated causal-ordering baggage that this brings with it). This theoretical formulation is conceptualised as a path model, and the relationship between y and x is decomposed as above.


Less is more. Stay pure. Stay poor.
Great thread, I am glad I remembered to come back and review it. I found it very beneficial.

My last two cents on moderation is that it can also be interpreted as additive or synergistic.