
Thread: Help in deciphering output from a regression

  1. #1




    **** the spaces between the values are missing in this post, please refer to the text file attached.

    Hi,

    I'm trying to do some regression analysis and I do not understand the output.

    I have run a simpler analysis so I hope someone can help me make sense of the output.

    I am trying to get n and e in the following equation

    x = n y + e

    The data I ran is:

     x      y
    56     89
    45     89
    12     56
    12    263


    The code I used is

    proc reg data=Try1;
    model x= y;
    run;

    The output I get is

    The REG Procedure
    Model: MODEL1
    Dependent Variable: x x

    Number of Observations Read    4
    Number of Observations Used    4


    Analysis of Variance

                                  Sum of          Mean
    Source              DF       Squares        Square    F Value    Pr > F

    Model                1     279.11433     279.11433       0.44    0.5747
    Error                2    1263.63567     631.81783
    Corrected Total      3    1542.75000


    Root MSE            25.13599    R-Square      0.1809
    Dependent Mean      31.25000    Adj R-Sq     -0.2286
    Coeff Var           80.43516


    Parameter Estimates

                                     Parameter    Standard
    Variable     Label          DF    Estimate       Error    t Value    Pr > |t|

    Intercept    Intercept       1    44.02699    22.96735       1.92      0.1953
    y            y               1    -0.10283     0.15472      -0.66      0.5747


    I also plotted the graph in Excel and I got

    y = -1.7594x + 179.23
    R² = 0.1809
    from Excel.

    I see the R² value is the same, but I don't know which parts of the SAS output are n and e.

    I would appreciate any help in deciphering the output.

    Thank you.



    -Connor
    Last edited by connor; 01-26-2012 at 03:24 AM. Reason: spaces missing, attached text file

  2. #2

    Connor, what you're trying to accomplish isn't very clear. I've 2 quick observations here:
    1) The model that you're estimating in Excel is different from the one in SAS. Both fit a straight line to the same data, but the roles of the variables are swapped, so you can't expect the coefficients to be the same. The Excel model is y = b + mx and the SAS model is x = a + ny.

    2) 'e' usually stands for the error term. If you really mean error when you say 'e', then you're estimating the wrong model in both cases. If your 'e' denotes the intercept, then it is fine.

    Assuming that 'e' is the intercept, then from the SAS output your equation is x = 44.027 - 0.10283y. The parameter estimate for y gives you 'n' and the one for Intercept gives you 'e'.
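As a rough cross-check of the two fits, here is a minimal sketch in plain Python (not SAS; `simple_ols` is a made-up helper name, and the formulas are the textbook least-squares ones). It fits the four data points both ways round. Note that rearranging the Excel line algebraically does not give the SAS line; the two slopes are linked only through R².

```python
# Data from the first post.
x = [56, 45, 12, 12]
y = [89, 89, 56, 263]

def simple_ols(dep, indep):
    """Least-squares fit of dep = intercept + slope * indep (hypothetical helper)."""
    count = len(dep)
    mean_d = sum(dep) / count
    mean_i = sum(indep) / count
    s_ii = sum((v - mean_i) ** 2 for v in indep)
    s_di = sum((d - mean_d) * (v - mean_i) for d, v in zip(dep, indep))
    slope = s_di / s_ii
    intercept = mean_d - slope * mean_i
    return intercept, slope

# SAS: model x = y  (x is the dependent variable)
a, n_hat = simple_ols(x, y)
# Excel trendline: y plotted against x (y is the dependent variable)
b, m = simple_ols(y, x)

print(f"x on y: x = {a:.5f} + ({n_hat:.5f}) * y")
print(f"y on x: y = {b:.2f} + ({m:.4f}) * x")
# The product of the two slopes equals R-squared, which is why both
# programs report the same 0.1809 even though the coefficients differ.
print(f"slope product = {n_hat * m:.4f}")
```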

    If your 'e' is the error term and not the intercept, then you need to estimate x = ny. You can do this with the NOINT option in the MODEL statement:
    proc reg data=Try1;
    model x= y /NOINT;
    run;

    This will estimate the model without the intercept, i.e. x = ny. Then you can get the error term 'e' for each observation by subtracting the predicted value from the observed value.
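For a line through the origin the least-squares slope has a simple closed form, n = sum(x*y) / sum(y*y). A minimal Python sketch of that formula on the four data points from the first post (variable names are mine, chosen for illustration); it comes out at about 0.1454:

```python
x = [56, 45, 12, 12]
y = [89, 89, 56, 263]

# Least squares through the origin: minimise sum((x - n*y)**2) over n,
# which gives n = sum(x*y) / sum(y*y).
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_yy = sum(yi * yi for yi in y)
n_noint = sum_xy / sum_yy

print(sum_xy, sum_yy)
print(f"{n_noint:.5f}")
```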

    Caution: your sample size is too small for any practical application.
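For the rest of the first output, the summary numbers can be rebuilt from the Sum of Squares and DF columns. A minimal Python sketch, assuming the standard OLS definitions (the variable names are mine, not SAS's); the inputs are copied from the output in the first post:

```python
import math

# Values copied from the first SAS output (model x = y, with intercept).
n_obs = 4
ss_model, df_model = 279.11433, 1
ss_error, df_error = 1263.63567, 2
ss_total = ss_model + ss_error          # Corrected Total

ms_model = ss_model / df_model          # Mean Square (Model)
ms_error = ss_error / df_error          # Mean Square (Error)
f_value = ms_model / ms_error           # F Value
root_mse = math.sqrt(ms_error)          # Root MSE
r_square = ss_model / ss_total          # R-Square
adj_r_sq = 1 - (1 - r_square) * (n_obs - 1) / df_error   # Adj R-Sq

# t Value for y is the estimate divided by its standard error.
# With a single predictor, t squared is (up to rounding) the F Value,
# which is why both rows show the same p-value, 0.5747.
t_value = -0.10283 / 0.15472

print(f"F={f_value:.2f}  RootMSE={root_mse:.5f}  R2={r_square:.4f}  AdjR2={adj_r_sq:.4f}")
print(f"t={t_value:.2f}  t^2={t_value ** 2:.2f}")
```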

  3. The Following User Says Thank You to jrai For This Useful Post:

    connor (01-26-2012)

  4. #3

    Hey Jrai,
    Thanks for responding!!

    1) You are right, my bad. I want x = n y + e

    2) I meant e as in error term

    Does the NOINT command force the intercept to be the origin?

    After I run the code you gave I get (text file attached)

    The REG Procedure
    Model: MODEL1
    Dependent Variable: x x

    Number of Observations Read    4
    Number of Observations Used    4


    NOTE: No intercept in model. R-Square is redefined.

    Analysis of Variance

                                  Sum of          Mean
    Source              DF       Squares        Square    F Value    Pr > F

    Model                1    1863.65377    1863.65377       1.56    0.3003
    Error                3    3585.34623    1195.11541
    Uncorrected Total    4    5449.00000


    Root MSE            34.57044    R-Square      0.3420
    Dependent Mean      31.25000    Adj R-Sq      0.1227
    Coeff Var          110.62541


    Parameter Estimates

                                     Parameter    Standard
    Variable     Label          DF    Estimate       Error    t Value    Pr > |t|

    y            y               1     0.14540     0.11644       1.25      0.3003


    Does that mean x= 0.14540y?

    If so, the expected y values are

    Expected y
    8.1424
    6.543
    1.7448
    1.7448


    So the error is

    y minus expected y
    80.8576
    82.457
    54.2552
    261.2552

    which sums up to 478.825.

    so the final equation is x= 0.14540y + 478.825? (I'm fairly sure I'm very off)

    My actual dataset has 3000 values and 4 variables but I figured I'd use a smaller set to understand what I'm doing.

    Hope you can clarify my confusion again.

    Thanks, Jrai!!


    -Connor

  5. #4

    Yes, the NOINT option forces the fitted line through the origin (intercept fixed at 0), which is not a good idea for a basic model unless you know for certain that the true specification of your model has no intercept.

    Connor, I'd recommend reading an introductory text on OLS/linear regression analysis; you need to get the basics right (no rudeness meant). The error term varies from observation to observation, so you can't just sum the error terms and add the total to the equation. The error term is usually not written in the fitted equation: you just leave it at E(x) = 0.14540y. The error is the random component, which under the OLS assumptions has mean 0.

    To calculate e for each observation, compute x - ny: the observed value is x and the predicted value of x is ny, i.e. 0.14540y. To make things easier, you can do all of this in SAS:

    proc reg data=Try1;
    model x= y /NOINT P R;
    run;

    The P option prints the predicted values, which give you ny for every observation. The R option prints the residual 'e' for each observation, i.e. x - ny. When you do this for 3000 observations, I'd recommend not displaying the output but saving it to a dataset as follows:

    proc reg data=Try1;
    model x= y /NOINT;
    output out=try2 r=resid p=pred;
    run;

    This saves the predicted values (in variable pred) and the residuals (in variable resid) in a dataset named try2.
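To see what those predicted values and residuals look like for the four-point example, here is a minimal Python sketch (not SAS output; it simply applies pred = 0.14540y and resid = x - pred, with 0.14540 taken from the NOINT output in post #3). The squared residuals should add up to roughly the Error sum of squares in that output. The residuals themselves do not add up to zero, which is normal for a fit without an intercept: the mean-zero statement is an assumption about the unobserved errors, not a property of these residuals.

```python
x = [56, 45, 12, 12]
y = [89, 89, 56, 263]
n_hat = 0.14540  # NOINT estimate taken from the SAS output in post #3

pred = [n_hat * yi for yi in y]                # predicted x for each observation
resid = [xi - pi for xi, pi in zip(x, pred)]   # e = x - n*y

for xi, yi, pi, ei in zip(x, y, pred, resid):
    print(f"x={xi:3d}  y={yi:3d}  pred={pi:8.4f}  resid={ei:9.4f}")

sse = sum(e * e for e in resid)   # compare with Error SS, 3585.34623
print(f"sum of squared residuals = {sse:.2f}")
print(f"sum of residuals         = {sum(resid):.4f}")
```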

    It'll be good to understand the difference between error & residuals: http://en.wikipedia.org/wiki/Errors_..._in_statistics

  6. #5
    Dason

    Originally posted by jrai:
    Yes, the NOINT option forces the fitted line through the origin (intercept fixed at 0), which is not a good idea for a basic model unless you know for certain that the true specification of your model has no intercept.
    Even then you might want an intercept.

  7. The Following User Says Thank You to Dason For This Useful Post:

    jrai (01-26-2012)

  8. #6

    Dason,
    Thanks for this article. A quick question:

    I remember a model where I was predicting sales. Because of some constraints I was not allowed to transform the response variable. The intercept was negative (about -1000), and this led to many negative predicted values (roughly 1000-2000, or maybe even more, out of 700,000 subjects). My idea was to restrict the intercept to be positive (I expected this would make all predictions positive). Is that a good idea? If not, are there any alternative suggestions for keeping the number of negative predictions low?

  9. #7
    Dason


    Did a normal response seem appropriate? Was there any issue with non-constant variance? Some sort of GLM using something like a gamma response might work?
