
Thread: Mallows Cp selection

  1. #1

    Mallows Cp selection




    I am using the best subsets method in Minitab to select the best parameters for a multiple regression, but I'm a bit confused about Mallows' Cp. Some sources I've read say to go with the best adjusted R2 that has the smallest Mallows' Cp statistic, but others seem to say to go with the Mallows' Cp statistic that equals the number of variables in the regression, including the y-intercept. I think this is just a wording issue in the various texts, but if I have 5 variables and a choice between a Mallows' Cp of 5.1 and a Mallows' Cp of 2, which equation is best? The one whose Cp matches the number of variables, or the one with the lowest statistic?

    Thanks!

  2. #2
    ledzep

    Re: Mallows Cp selection

    There are several diagnostics to use when doing all possible regressions. Adjusted R2, Mallows' Cp, and the residual mean square are commonly used, and all three can be used in tandem. They will not necessarily point to exactly the same model, but each of them comes in handy.

    "but if I have 5 variables and a choice between a Mallows Cp of 5.1 and a Mallows Cp of 2, which equation is best? The one with the same number of variables, or the one with the lowest statistic?"

    Let p = the number of parameters in the model (including the intercept).

    Mallows' Cp statistic is basically an estimate of your model's total prediction error, which includes any bias from leaving out relevant variables. Models with little bias will fall near or below the Cp = p line, whereas models with larger bias will fall above, often well above, the Cp = p line.
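For reference, the usual textbook definition of the statistic is sketched below. It is not spelled out in the post itself, so every symbol other than p (SSE_p, n, and the full-model variance estimate) is notation assumed here.

```latex
% SSE_p          : residual sum of squares of the candidate model with p parameters
% \hat{\sigma}^2 : residual mean square of the full model (all candidate predictors)
% n              : number of observations
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p,
\qquad
\mathbb{E}[C_p] \approx p \quad \text{when the candidate model has negligible bias.}
```

This is why the Cp = p line is used as the yardstick: a model that omits an important variable inflates SSE_p and pushes Cp above p.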

    So, I would choose a model whose Cp value is close to p.

    In your case, the number of parameters = 5 variables + 1 intercept = 6, so p = 6. A Cp value of 5.1 is much closer to 6 than a Cp value of 2 is. Hence, I would choose the model with the Cp value of 5.1.
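To make the comparison concrete, here is a minimal Python sketch of how Cp could be computed for every subset of predictors and compared with p. It is not Minitab's implementation: the data are synthetic, and the names `sse`, `mallows_cp_table`, `X_demo`, `y_demo`, and `table` are hypothetical, chosen only for illustration.

```python
from itertools import combinations

import numpy as np


def sse(design, y):
    """Residual sum of squares of an OLS fit of y on a design matrix
    that already contains the intercept column."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def mallows_cp_table(X, y):
    """Return (predictor_indices, p, Cp) for every non-empty subset of the
    columns of X, where p counts the intercept as a parameter."""
    n, k = X.shape
    ones = np.ones((n, 1))
    # Residual mean square of the full model (all k predictors + intercept).
    mse_full = sse(np.hstack([ones, X]), y) / (n - (k + 1))
    rows = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            p = size + 1  # predictors in the subset plus the intercept
            design = np.hstack([ones, X[:, list(cols)]])
            cp = sse(design, y) / mse_full - n + 2 * p
            rows.append((cols, p, cp))
    return rows


# Synthetic data (an assumption for illustration): 5 candidate predictors,
# of which only the first two actually influence y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] + rng.normal(size=60)

table = mallows_cp_table(X_demo, y_demo)

# List the models whose Cp is closest to p first.
for cols, p, cp in sorted(table, key=lambda r: abs(r[2] - r[1]))[:5]:
    print(f"predictors={cols}  p={p}  Cp={cp:.2f}  |Cp - p|={abs(cp - p):.2f}")
```

Note that p changes with the size of each subset, so a Cp of 2 and a Cp of 5.1 should each be compared against the p of its own model rather than against a single fixed value.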

  3. #3
    Dragan

    Re: Mallows Cp selection

    Ledzep: The OP should really look at the results sequentially. For example, the Cp value of 2 could be associated with a regression model with only 1 independent variable (I.V.). I doubt it, but I suspect you can see my point.

    Basically, note that Cp will by construction be exactly p+1 when you regress on the full model, where p here counts the independent variables only (so p+1 includes the intercept).
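The full-model identity follows directly from the usual definition of Cp. A short derivation, writing K for the number of independent variables in the full model and P = K + 1 for its parameter count (both symbols, along with SSE and n, are notation assumed here rather than taken from the thread):

```latex
% Usual definition, with \hat{\sigma}^2 taken from the full model:
%   C_p = SSE_p / \hat{\sigma}^2 - n + 2p,   \hat{\sigma}^2 = SSE_full / (n - P)
C_P = \frac{\mathrm{SSE}_{\text{full}}}{\mathrm{SSE}_{\text{full}} / (n - P)} - n + 2P
    = (n - P) - n + 2P
    = P
    = K + 1 .
```

So the full model always sits exactly on the Cp = p line and carries no information by itself; only the Cp values of the smaller subsets are informative.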

  4. #4

    Re: Mallows Cp selection


    Thanks for the input. This makes it a lot clearer.
