
Thread: Clarify this explanation of why we use d.f.?

  1. #1
    bruin

    Clarify this explanation of why we use d.f.?




    I have read some variant of this explanation of d.f.'s many times - I'll quote the latest one that I read:

    Degrees of Freedom: 1-Sample t test

    You have a data set with 10 values. If you’re not estimating anything, each value can take on any number, right? Each value is completely free to vary.

    But suppose you want to test the population mean with a sample of 10 values, using a 1-sample t test. You now have a constraint—the estimation of the mean.

    This explanation would make more sense to me if they said that the constraint arises from using a sample variance to estimate a population variance. That would explain why you use d.f. on one-sample t but not one-sample z.

    But, given that they say the constraint arises from using the sample mean to estimate the pop mean, I can't understand the discrepancy between z test and t-test. Both z-statistic and t-statistic use a sample mean to estimate a population mean.

  2. #2
    bruin

    Re: Clarify this explanation of why we use d.f.?

    I think I kind of get the z/t discrepancy from the fact that there is only one z (normal) distribution but many t-distributions (depending on n). Since n is fixed for a given t-distribution, there's now a constraint: you don't just need a certain sample mean, you need a particular n scores that average out to that sample mean.
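    [Editor's note: the "one z, many t's" idea above is easy to check by simulation. This is a minimal sketch in standard-library Python, not anything from the thread; the sample sizes and cutoff are made-up illustrations. Under a true null, the t statistic's tails get heavier as n shrinks, while the z statistic (with sigma treated as known) behaves the same at every n.]

    ```python
    import math
    import random
    import statistics

    random.seed(0)

    def t_stat(n, mu=0.0, sigma=1.0):
        """One-sample t from a sample of size n drawn under the null."""
        x = [random.gauss(mu, sigma) for _ in range(n)]
        s = statistics.stdev(x)  # divides by n - 1 and uses the *estimated* mean
        return (statistics.mean(x) - mu) / (s / math.sqrt(n))

    def z_stat(n, mu=0.0, sigma=1.0):
        """One-sample z: same sample mean, but sigma is treated as known."""
        x = [random.gauss(mu, sigma) for _ in range(n)]
        return (statistics.mean(x) - mu) / (sigma / math.sqrt(n))

    def tail(stat, n, reps=20000, cut=2.0):
        """Estimated P(|statistic| > cut) when the null is true."""
        return sum(abs(stat(n)) > cut for _ in range(reps)) / reps

    for n in (3, 10, 100):
        print(n, round(tail(t_stat, n), 3), round(tail(z_stat, n), 3))
    ```

    The t column shrinks toward the z column as n grows (a different t distribution for each n), while the z column stays near the same value at every n - one fixed reference distribution.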

    But is there a more detailed explanation someone can give me that fleshes this idea out a lot more? Do you simply have to have studied "mathematical statistics" (as opposed to the "practical stats" many students are taught) to really appreciate degrees of freedom at a deeper level than what I've said above?
    Last edited by bruin; 10-05-2017 at 03:30 PM.

  3. #3
    hlsmith

    Re: Clarify this explanation of why we use d.f.?

    Well, the variance and the mean are both parameters, if that makes a difference.


    Your Z and T comparison makes sense to me, but I am "practical".

  4. The Following User Says Thank You to hlsmith For This Useful Post:

    bruin (10-05-2017)

  5. #4
    bruin

    Re: Clarify this explanation of why we use d.f.?

    Thanks hlsmith - I'll only bump this one time, I promise.

    But does anyone have anything to add here? Isn't there any meatier/more-satisfying explanation possible than the one I gave, without recourse to the "mathematical" calc-based stats?

  6. #5
    rogojel

    Re: Clarify this explanation of why we use d.f.?

    hi,
    I am also a purely "practical" guy, and to me this is an issue of using the same term (d.f.) in two different ways. For a t-test, d.f. seems to me to be a simple label for the relevant distribution. I can understand why they call it d.f., but any other name would do ("underlying sample size"? just an idea). This labelling has IMHO nothing to do with constraints on our data.

    In calculating the variance we do have degrees of freedom instead of sample sizes - because we do have a constraint: the value of the mean.

    I wonder what others think?
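    [Editor's note: the "constraint" rogojel mentions can be made concrete with a small standard-library Python sketch (the sample values here are made up). Once you compute deviations from the sample mean, they always sum to zero, so only n - 1 of them are free - the last one is forced.]

    ```python
    import random

    random.seed(1)

    n = 10
    x = [random.gauss(50, 12) for _ in range(n)]
    xbar = sum(x) / n
    dev = [xi - xbar for xi in x]

    # The n deviations satisfy one linear constraint: they sum to
    # (numerically) zero.
    print(sum(dev))

    # So the last deviation is completely determined by the other n - 1.
    forced_last = -sum(dev[:-1])
    print(abs(forced_last - dev[-1]) < 1e-9)
    ```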

  7. #6

    Re: Clarify this explanation of why we use d.f.?

    Quote Originally Posted by rogojel View Post
    hi,
    I am also a purely "practical" guy, and to me this is an issue of using the same term (d.f.) in two different ways. For a t-test, d.f. seems to me to be a simple label for the relevant distribution. I can understand why they call it d.f., but any other name would do ("underlying sample size"? just an idea). This labelling has IMHO nothing to do with constraints on our data.

    In calculating the variance we do have degrees of freedom instead of sample sizes - because we do have a constraint: the value of the mean.

    I wonder what others think?
    If you both are referring to the sample variance being calculated with (N-1) rather than N, this is done to provide an unbiased estimate of the population variance/sd. Using N creates a downward bias in the estimate, and (N-1) corrects for that.
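    [Editor's note: the downward bias, and the (N-1) fix, can be checked with a quick simulation - a standard-library Python sketch with a made-up true variance of 4.]

    ```python
    import random

    random.seed(2)

    true_var = 4.0
    n, reps = 5, 40000

    avg_biased = avg_unbiased = 0.0
    for _ in range(reps):
        x = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
        xbar = sum(x) / n
        ss = sum((xi - xbar) ** 2 for xi in x)
        avg_biased += (ss / n) / reps          # divide by N
        avg_unbiased += (ss / (n - 1)) / reps  # divide by N - 1

    print(avg_biased)    # averages near true_var * (n - 1) / n = 3.2
    print(avg_unbiased)  # averages near true_var = 4.0
    ```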

    The Z test assumes the population variance is known which means you're technically not "estimating" the variance since it is a known quantity (look at background on this if you're curious). You're also assumed to be working with a "large" sample (ideally infinite, or the size of population, I would imagine).

    Maybe another person can add a bit more or correct anything I noted that's incorrect, gotta run!

  8. #7
    hlsmith

    Re: Clarify this explanation of why we use d.f.?

    d.f., the enigma. Yeah, to piggy-back on ondan: when you have a larger sample, closer to the population size, subtracting 1 doesn't matter as much.


    I always try to remember it as the number of values that are free to vary - or the idea similar to dummy coding, where you only need so many terms to represent all of the categories and the last one is determined by the others. I am sure I just butchered that concept.

  9. #8
    Dason

    Re: Clarify this explanation of why we use d.f.?


    One thing to note (and it explains the difference between the z-test and the t-test) is that the degrees of freedom are associated with the error term. So when you're thinking about how many parameters need to be estimated, it's specifically how many parameters need to be estimated on the way to getting an estimate of the variance of the test statistic. For a z-test you *know* the variance already, so you don't have any degrees-of-freedom issues to worry about. For a simple one-sample t-test you have to estimate a mean, and that mean gets used in the estimation of the standard deviation - that's where you lose your degree of freedom.

    Ultimately the answer is "that's how the math works out" but the intuition which follows the math has you looking at the estimate of the variance and how many independent observations you actually end up with to estimate that variance.
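    [Editor's note: Dason's "how many independent observations you actually end up with" can also be seen numerically. In this standard-library Python sketch (sample size and sigma are made up), the scaled sum of squared deviations about the sample mean averages n - 1, not n - one observation's worth of information went into estimating the mean.]

    ```python
    import random

    random.seed(3)

    n, reps = 6, 30000
    sigma = 2.0

    total = 0.0
    for _ in range(reps):
        x = [random.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(x) / n
        total += sum((xi - xbar) ** 2 for xi in x) / sigma ** 2

    # Averages near n - 1 = 5, not n = 6: one degree of freedom went
    # into estimating the mean.
    print(total / reps)
    ```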

  10. The Following User Says Thank You to Dason For This Useful Post:

    ondansetron (10-10-2017)
