+ Reply to Thread
Results 1 to 6 of 6

Thread: Did I do this correctly?

  1. #1
    Points: 2,313, Level: 29
    Level completed: 9%, Points required for next Level: 137

    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Did I do this correctly?



    Hi,

    I am trying to forecast electric power demand for natural gas in the near future. A huge driver of electric power demand for natural gas on a monthly basis is weather, specifically Cold Demand Days when you need to use your air conditioning. I have weather forecasts for Cold Demand Days by month for next year.

    Also, over the past 10 years, the annual increase in electric power demand for natural gas has risen 3-5% a year as power plants use more natural gas (and less coal), and I see this trend continuing.

    So…I ran a multiple regression in Excel with my dependent variable as Natural Gas Used Per Day and my independent variables as 1) Cold Demand Days and 2) a Time Variable when starts at “1” in Jan-2001 and ends with “99” in Mar-2009.

    Here are the results:

    Regression Statistics
    Adjusted R Square 0.90
    Standard Error 1.41
    Observations 99

    Coefficients Standard Error t Stat
    Intercept 9.71 0.31 30.88
    Time Variable 0.05 0.00 9.77
    CDD Variable 0.03 0.00 28.76


    I am not a Stats expert, so I wanted to run this by you guys to see if everything made sense. Do you think there's anything I missed? Did I handle the Time Variable correctly?

    Thank you very much!

  2. #2
    Points: 2,313, Level: 29
    Level completed: 9%, Points required for next Level: 137

    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    And a follow-up question: I'm also trying to run a regression for Residential power demand for natural gas. When I do the same type of multiple regression, my Time Variable only returns a T-Stat of -1.83. Would I be better off not including this variable?

  3. #3
    TS Contributor
    Points: 7,313, Level: 56
    Level completed: 82%, Points required for next Level: 37
    CowboyBear's Avatar
    Location
    New Zealand
    Posts
    1,190
    Thanks
    32
    Thanked 132 Times in 101 Posts
    Quote Originally Posted by naturallight View Post
    And a follow-up question: I'm also trying to run a regression for Residential power demand for natural gas. When I do the same type of multiple regression, my Time Variable only returns a T-Stat of -1.83. Would I be better off not including this variable?
    What is the p value associated with the t statistic? (Sorry, I'm too lazy to lock it up myself ) If the p value indicates that the coefficient is not "statistically significant" you *might* want to delete it - though also consider the actual size of the coefficient and how much it adds to the R-squared.

    Overall your analysis seems to make sense, the adjusted R-squared in the initial analysis is very impressive (to a psychology researcher anyway - to us an R-squared of 0.30 for a 2 variable model is cause to jump with joy!)

    In terms of handling the time variable, my main question would be whether the relationship between time and consumption was/is genuinely linear - remember that the regression can only capture linear effects. I.e. the general trend may have been 3-5% increases, but what I'd imagine there'd be in the data (without looking) would be large-ish increases in the early 00's, and then smaller increases or even decreases as the recession took hold.

    As well as the time/linearity issue you probably need to check that the other assumptions of multiple regression are met (or not too badly violated). I'm not sure how easy this is in Excel - you may want to see if you can get access to SPSS, or another specialised statistics program.

  4. #4
    TS Contributor
    Points: 7,313, Level: 56
    Level completed: 82%, Points required for next Level: 37
    CowboyBear's Avatar
    Location
    New Zealand
    Posts
    1,190
    Thanks
    32
    Thanked 132 Times in 101 Posts
    Have looked up p for a t value of -1.83 (df 2), 2-tailed p = 0.2087... yeah, definitely not "statistically significant", so you might want to leave the variable out - but again, consider how much it adds to the R-squared. If it's a lot and you have adequate theoretical/literature justification to include it, I think it'd be ok to leave it in. Null hypothesis testing isn't the be-all and end-all, after all!

  5. #5
    Points: 2,313, Level: 29
    Level completed: 9%, Points required for next Level: 137

    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Great, thanks CowboyBear. I sincerely appreciate it.

    Actually, according to the Excel regression, the p-value is 0.07 for the time variable...seems decent, no?

  6. #6
    TS Contributor
    Points: 7,313, Level: 56
    Level completed: 82%, Points required for next Level: 37
    CowboyBear's Avatar
    Location
    New Zealand
    Posts
    1,190
    Thanks
    32
    Thanked 132 Times in 101 Posts

    Oh! Well, gosh, if the p value is 0.07... I would say that it's close enough to the usual alpha=0.05 for you to do what you feel is right (as long as there are other good reasons to have it in the model). A p value of 0.07 still indicates, after all, a pretty darn low chance that the observed effect would come up if the population effect was zero.

+ Reply to Thread

Similar Threads

  1. Tell me if I'm setting this up correctly please
    By disgurl in forum Probability
    Replies: 1
    Last Post: 06-16-2009, 01:24 PM
  2. Am I using T-Test correctly?
    By zi_pota in forum Statistics
    Replies: 3
    Last Post: 11-23-2008, 04:36 AM
  3. Replies: 1
    Last Post: 10-28-2008, 11:33 PM
  4. PLEASE~ tell me if i'm doing these correctly
    By pluto in forum Probability
    Replies: 0
    Last Post: 08-08-2007, 08:50 PM
  5. Did I do this correctly?
    By Jerryberry in forum Statistics
    Replies: 1
    Last Post: 01-04-2006, 06:52 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts








Advertise on Talk Stats