Multiple Regression Analysis including Control Variables

Hi Everyone,

I'm currently working on my master's thesis, for which I created a model that was approved by the university. I decided to improve the quality of the model by taking two control variables into account: age and organizational tenure. All the variables in this model are measured on ratio scales.
My problem is that I'm not sure how to test my hypotheses. I've added a picture of my design, including the hypotheses, to this post.

How should I test the hypotheses and model?

First check if A is related to the outcome variables (E,F,G,H)?

Then test the hypotheses per dependent variable?
For example with Hypothesis 1:
Control Variables --> A --> B --> E (and compare model 1 to model 2 in output)
Control Variables --> A --> B --> G (" ")
Control Variables --> A --> B --> H (" ")

After doing that I would need to test the whole model by testing the dependent variables?
For example with:
Control Variables --> A --> B,C,D --> G

Any advice or feedback would be greatly appreciated, since my experience with these kinds of designs is very limited.

Kind Regards,

Out of curiosity: I was reading through your post trying to piece it together, but to start, is this an econometric project? That would at least give me a bit of bearing.


Less is more. Stay pure. Stay poor.
Your illustration almost makes sense to me. Can you describe it in more detail? For example, am I correct in stating that you are looking at 8 different models, and what do the -/+ represent here (the directionality of the relationship)? And out of curiosity, how are these variables formatted (continuous, categorical, or a mix)?
theUP42: Thank you for responding. It is in fact psychological research. I will provide more information below.
hlsmith: Thanks for responding. The -/+ does indeed indicate the directionality. The variables are all ratio scales (running from 1 to 5); I believe that would make them continuous? The total model has 8 hypotheses. So doing the analysis with control variables would mean that I have 8 hypotheses with 2 models each? But then should I also test the model as a whole by doing an analysis per dependent variable (E through H)?

To give an example of how you could read this design (it's not my actual design, though):
Neuroticism (A) leads to an increased need for competence (B), an increased need for belonging (C), and a lower need for autonomy (D).
All of those (B, C, D) would in turn lead to outcomes such as higher performance and lower burnout.

Since my design contains some difficult constructs, I relabeled them A through H.
That is a pretty large chunk to bite off for a master's thesis.

Are you asking how to measure these constructs or what statistical model to use to analyze the data?

If you already have data, then there are a number of ways to analyze it, depending on what data you have.

If you are asking what kind of data you need to test your hypotheses, then that becomes a little more complicated.


So A contributes to B, C, and D which influence E, F, G, H. So you have mediators and potential collinearity?
First of all, thank you for taking the time to read my posts and to respond.
I’m not struggling to find the right way to collect my data, but I’m wondering how to test my hypotheses and model as a whole.
#1: Looking at hypothesis 1: I would have to run three analyses to test this hypothesis. First I’d need to test the relation from A-->B-->E while controlling for age and tenure. For that I’d need to compare the two models in the ‘output’ (to test for the effects of the control variables). The same goes for the other dependent variables G and H. If I’ve done that for all the hypotheses I am wondering if I’d still need to test the model as a whole by doing analyses per dependent variable.
This would mean that I would run an analysis like: A-->B,C,D-->G while controlling for age and tenure.
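
For reference, a single path such as A-->B-->E with the two controls is commonly written as two regression equations. This is only a sketch of the standard product-of-coefficients formulation; the symbols i, a, b, c', c and d are notation introduced here for illustration and do not come from the design picture:

```latex
\begin{align}
B &= i_1 + a\,A + c_1\,\text{Age} + c_2\,\text{Tenure} + \varepsilon_1 \\
E &= i_2 + c'\,A + b\,B + d_1\,\text{Age} + d_2\,\text{Tenure} + \varepsilon_2 \\
\text{indirect effect of } A \text{ on } E \text{ through } B &= a \cdot b
\end{align}
```

Under this notation, full mediation would correspond to a direct effect c' close to zero alongside a non-zero indirect effect a·b. The indirect effect is often tested with a bootstrapped confidence interval rather than by comparing the two models by eye.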
I believe that my first tests are correct but that last part I don’t know for sure.
Once again thanks for responding and taking your time to help me out.
B, C and D would be (full) mediators between A and the dependent variables in this model. Therefore I would expect collinearity (correlations among them) but not multicollinearity (since the absence of multicollinearity is an assumption of regression analysis).

I really appreciate both of you taking your time to help me out on this.
Kind regards and have a nice weekend,
Be careful about testing a model that large as a whole. You can run into issues of model saturation. The first thing you probably want to do is get a look at the correlations between those constructs.
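
A first look at the correlations could be sketched roughly like this. The scores below are simulated stand-ins, not real data, and the effect sizes used to generate them are arbitrary assumptions; only the labels A to D follow the thread. The sketch sticks to the Python standard library so it runs anywhere, although in practice a statistics package would do the same job in one call.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Simulated stand-in scores (NOT real data); the coefficients are arbitrary.
random.seed(1)
n = 150
A = [random.gauss(3.0, 0.8) for _ in range(n)]
scores = {
    "A": A,
    "B": [0.6 * a + random.gauss(0, 0.5) for a in A],
    "C": [0.5 * a + random.gauss(0, 0.6) for a in A],
    "D": [-0.4 * a + random.gauss(0, 0.6) for a in A],
}

# Full correlation matrix as a nested dict: corr[row][column].
names = list(scores)
corr = {r: {c: pearson(scores[r], scores[c]) for c in names} for r in names}

# Print the matrix with two decimals.
print("    " + "".join(f"{c:>7}" for c in names))
for r in names:
    print(f"{r:>4}" + "".join(f"{corr[r][c]:7.2f}" for c in names))
```

Very high correlations among B, C and D in such a table would be an early warning that their separate effects may be hard to tell apart in one regression.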


No cake for spunky
Five levels (a Likert scale, I assume) really is not continuous data, although authors disagree on whether you can treat it as interval-like (or even whether that concept makes sense at all). That said, the impact on standard errors in SEM is smaller at five levels than at four.
Hello everyone,
Yesterday I had a conversation with a Professor at the university.
Since you all put effort into helping me, I thought the least I could do was explain how I will test my model.

I will run four analyses for each of the dependent variables.
The regression would be as follows:
Dependent variable: Variable G
Box 1: Control variables
Box 2: Variable A
Box 3: Variables B,C,D
Box 4: interactions between A&B, A&C, A&D
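
The blockwise entry listed above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (not real), the generating coefficients are arbitrary, the predictors are mean-centered before the interaction terms are formed (a common but optional choice), and `ols_r2`, `hierarchical_r2` and `centered` are hypothetical helper names. It uses only the Python standard library; a statistics package would normally report the same R-squared change and F change directly.

```python
import random


def ols_r2(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
           + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(aug[r][p]))
        aug[p], aug[best] = aug[best], aug[p]
        for r in range(k):
            if r != p:
                factor = aug[r][p] / aug[p][p]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[p])]
    beta = [aug[r][k] / aug[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter blocks cumulatively; return (label, R^2, delta R^2, F change) per step."""
    n = len(y)
    results, columns, prev_r2 = [], [], 0.0
    for label, block in blocks:
        columns = columns + block
        r2 = ols_r2(columns, y)
        df_resid = n - len(columns) - 1
        f_change = ((r2 - prev_r2) / len(block)) / ((1.0 - r2) / df_resid)
        results.append((label, r2, r2 - prev_r2, f_change))
        prev_r2 = r2
    return results


def centered(values):
    """Subtract the mean, so that product terms are less correlated with main effects."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Simulated stand-in data (NOT real); all generating coefficients are arbitrary.
random.seed(0)
n = 200
age = [random.gauss(40, 10) for _ in range(n)]
tenure = [0.3 * (a - 20) + random.gauss(0, 3) for a in age]
A = centered([random.gauss(3.0, 0.8) for _ in range(n)])
B = centered([0.6 * a + random.gauss(0, 0.5) for a in A])
C = centered([0.5 * a + random.gauss(0, 0.6) for a in A])
D = centered([-0.4 * a + random.gauss(0, 0.6) for a in A])
G = [3 + 0.01 * ag + 0.3 * b + 0.2 * c - 0.3 * d + 0.15 * a * b + random.gauss(0, 0.7)
     for ag, a, b, c, d in zip(age, A, B, C, D)]

blocks = [
    ("Box 1: controls", [age, tenure]),
    ("Box 2: + A", [A]),
    ("Box 3: + B, C, D", [B, C, D]),
    ("Box 4: + AxB, AxC, AxD", [[a * m for a, m in zip(A, M)] for M in (B, C, D)]),
]

results = hierarchical_r2(blocks, G)
for label, r2, delta, f_change in results:
    print(f"{label:<26} R2={r2:.3f}  dR2={delta:.3f}  F change={f_change:.2f}")
```

Each row shows how much explained variance a box adds beyond the boxes entered before it. The same loop could be repeated with E, F or H in place of G.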

Following these analyses I will run ANOVAs to determine the "direction" of the differences.

That way I will be able to provide graphs of the interaction effects (if any occur).

I'd like to thank you for your time and effort and if you have any questions or remarks concerning this post please respond.

Kind regards,