
Thread: Controlling for a Variable that directly affects both independent and dependent var

  1. #1

    Controlling for a Variable that directly affects both independent and dependent var

    Sorry this may be a very basic question but I was wondering what statistical method I can use to control for this.

Let's say I am measuring whether X is strongly correlated with Y.

    I measure X, and I measure Y.

However, let's say that we have another variable Z: the higher the value of X, the higher the value of Z, but Z can also contribute to increasing Y.

    How would I control for Z in this hypothetical study?

For example, let's say I am measuring the concentration of a bacterium in the gut and seeing whether it is related to a quantifiable quality-of-life scale. Let's say that being overweight also directly increases scores on that scale.

Let's also say that the higher the concentration of bacteria in the gut, the more overweight you are as well. (So the concentration of bacteria increases both being overweight and the quality-of-life score.) How do I control for being overweight?

    Thank you very much!

  2. #2
    Junes

Re: Controlling for a Variable that directly affects both independent and dependent var

What is typically done in this case is to add the confounding variable to the model. So you measure weight and add it to the model. Under certain assumptions, this lets you estimate the relation between the dependent and the independent variable while controlling for the confounder. Assuming linear regression (though the principle is the same for any approach), you might get:

Y = a + bX + cZ + ε

    where Y is your dependent variable, X your independent variable, Z your confounder, a, b and c are coefficients, and ε is the error term. Each coefficient gives you the relation between the corresponding variable and the dependent, holding the other variable constant. So mathematically there is no difference between the two predictors; the model itself tells you nothing about the causal interpretation.
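    To make this concrete, here is a minimal sketch using simulated data that mimics your example (all numbers and variable names are illustrative assumptions, not from your study): X is bacteria concentration, Z is weight, which depends on X, and Y is the quality-of-life score. Fitting Y on X alone inflates the X coefficient, because it absorbs Z's effect; adding Z to the model recovers the direct effect of X.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000

    # Hypothetical setup: X (bacteria) raises Z (weight),
    # and both X and Z raise Y (quality-of-life score).
    X = rng.normal(size=n)
    Z = 0.8 * X + rng.normal(size=n)                 # Z depends on X
    Y = 1.0 + 0.5 * X + 0.7 * Z + rng.normal(size=n)  # true direct effect of X is 0.5

    # Naive model: Y ~ X only. The X slope absorbs Z's effect (~0.5 + 0.7*0.8 = ~1.06).
    A_naive = np.column_stack([np.ones(n), X])
    b_naive, *_ = np.linalg.lstsq(A_naive, Y, rcond=None)

    # Adjusted model: Y ~ X + Z. Each slope holds the other variable constant.
    A_adj = np.column_stack([np.ones(n), X, Z])
    b_adj, *_ = np.linalg.lstsq(A_adj, Y, rcond=None)

    print("naive slope for X:   ", b_naive[1])  # inflated, ~1.06
    print("adjusted slope for X:", b_adj[1])    # ~0.5, the direct effect
    print("adjusted slope for Z:", b_adj[2])    # ~0.7
    ```

    In practice you would use a regression package (e.g. statsmodels or R's lm) rather than raw least squares, since those also give you standard errors and p-values for each coefficient.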

For instance, when we speak about causation, there is a difference between a confounding and a mediating variable, which you seem to hint at in your last paragraph. If the influence of the independent variable is hypothesized to go via another variable, then you might want to look at mediation analysis. See here for a nice blog explaining the difference.
    Last edited by Junes; 10-26-2016 at 07:23 AM.
