Thread: Problem interpreting Multiple Regression on Video Game Sales

  1. #1
    Hi everyone,

    I am a new member, and I am so glad to have found the forum. I'm tearing my hair out about this problem. These are my results:



    "Views" = Hits on our company Facebook page.
    "Reviews" = Number of 3rd party reviews published about our game that link to our website.
    "Coupon" = Discount coupons for the game.
    "Shareware" = Unique listings of our product on Shareware websites that link to our website.
    "Price" = Pricing of our product. The product pricing was changed three times, starting at $19.99, down to $9.99, and then $6.99.
    "Downloads" = Downloads of the trial version of the product from our website.

    Am I right in interpreting this data to say that with an increase of 1 game review, product units sold will decrease by 6.94? Also, it is true that an increased price decreases total unit sales, but the amount it decreases by seems larger than in reality. Is the "Day" value skewing the results?

    Here's how I got the results using R:

    results <- summary(lm(Sales ~ Downloads + Price + Coupon + Shareware + Views + Reviews + Day))
    where "Sales" is the dependent variable and the other terms are independent variables.
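For readers who want to reproduce this kind of fit, here is a minimal sketch. The data frame `games` and all of its values are simulated and purely hypothetical, since the thread's real data set is not shown. It keeps the fitted model object rather than only its summary, and prints each coefficient with a confidence interval. Each estimate is read as the expected change in Sales for a one-unit change in that variable with the other variables held fixed.

```r
# Hypothetical simulated data; the thread's real data set is not available.
set.seed(1)
n <- 120
games <- data.frame(
  Day       = 1:n,
  Downloads = rpois(n, 2000),
  Reviews   = rpois(n, 3),
  Price     = rep(c(19.99, 9.99, 6.99), each = n / 3)
)
games$Sales <- 5 + 0.02 * games$Downloads - 1.5 * games$Price + rnorm(n, sd = 4)

# Keep the model object itself; summary() is only one view of it.
fit <- lm(Sales ~ Downloads + Price + Reviews + Day, data = games)

coef_table <- coef(summary(fit))  # estimate, std. error, t value, p-value
ci <- confint(fit)                # 95% confidence intervals by default
print(round(cbind(coef_table, ci), 4))
```

Passing `data = games` avoids relying on loose variables in the workspace, and a wide confidence interval is a quick sign that a coefficient such as the one for Reviews is not pinned down well by the data.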


    Can someone tell me what I could have possibly done wrong to produce these counterintuitive results? Are there any ways to test their validity? I am so sorry if these questions are stupid. I have to learn multiple regression for work, with very little background in math, and no teacher! Any help is appreciated, thanks!
    Last edited by aplfalcon; 11-09-2009 at 09:29 PM. Reason: Clarification

  2. #2
    terzi (TS Contributor)
    Hi aplfalcon,

    There are certain things that could be producing wrong results:

    * Some variables in the model appear to be non-significant; that is, given the other variables, these measures don't add much to explaining sales. Before adjusting the model, try analyzing the relationships with scatter plots and correlations.

    * The response variable may be skewed (this is very common when dealing with money), so you should analyze its distribution first.

    * Certain assumptions may not be met in your model. The most important are normality of the residuals and constant variance across the observations. Those assumptions should be checked graphically.

    * Since your response variable seems to be measured over time, your data may be autocorrelated, which will cause trouble in an Ordinary Least Squares regression model.
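The graphical checks in the list above can be sketched in base R roughly as follows. The data here are simulated with deliberately autocorrelated noise, and every name and number is hypothetical. It is meant only to show which plots to look at, not what the thread's data would show.

```r
# Hypothetical simulated daily data with AR(1) noise, so that
# consecutive days are correlated, as daily sales often are.
set.seed(2)
n <- 100
day <- 1:n
downloads <- rpois(n, 2000)
noise <- as.numeric(arima.sim(list(ar = 0.7), n = n))
sales <- 10 + 0.01 * downloads + 3 * noise

fit <- lm(sales ~ downloads + day)
res <- residuals(fit)

# Residuals vs fitted values: a funnel shape suggests non-constant variance.
plot(fitted(fit), res, xlab = "Fitted value", ylab = "Residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points far from the line suggest non-normal residuals.
qqnorm(res)
qqline(res)

# Autocorrelation of residuals in time order: bars outside the dashed
# bands suggest the errors are not independent from day to day.
acf(res, main = "Residual autocorrelation")
lag1 <- acf(res, plot = FALSE)$acf[2]  # lag-1 autocorrelation as a number
print(lag1)
```

`plot(fit)` produces a similar set of standard diagnostic plots in one call. The autocorrelation check only makes sense if the rows are in time order.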

    As you can see, your data appears to be a little tricky. Try doing a deeper exploratory analysis; I would also suggest fitting a robust regression model.
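One possible way to try a robust regression in R is `rlm()` from the MASS package, which fits by Huber M-estimation by default. The thread does not say which robust method was meant, so this choice is an assumption, and the data below are simulated and hypothetical, with a few artificial spike days added to show the effect.

```r
library(MASS)  # provides rlm(); MASS ships with standard R installations

# Hypothetical simulated data with a few extreme days.
set.seed(3)
n <- 80
d <- data.frame(Downloads = runif(n, 1000, 3000))
d$Sales <- 2 + 0.01 * d$Downloads + rnorm(n)
d$Sales[1:4] <- d$Sales[1:4] + 60  # four artificial outlier days

ols <- lm(Sales ~ Downloads, data = d)   # ordinary least squares
rob <- rlm(Sales ~ Downloads, data = d)  # robust fit (Huber M-estimator)

# Compare the coefficients side by side.
print(rbind(OLS = coef(ols), Robust = coef(rob)))

# The robust fit down-weights observations with large residuals.
print(round(rob$w[1:4], 3))
```

A large gap between the two sets of coefficients is a hint that a handful of unusual observations is driving the ordinary least squares result.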
    Statisticians are engaged in an exhausting but exhilarating struggle with the biggest challenge that philosophy makes to science: how do we translate information into knowledge

  3. #3
    Hi Terzi, thanks so much for your reply. It's really helpful.

    I noticed that some of the variables do appear non-significant, but can they be significant in large numbers? 1 download doesn't do much for sales, but my data set includes thousands of downloads per day. Same goes for views.

    I did get the sense that the time variable did not fit in the model. Thanks for suggesting the Robust Regression model, I will look into it. Also, are "residuals" the same as the minimum sum of squared errors (SSE)? And when you say "normality," do you mean a normal distribution? Thanks for your help.

    It looks like I'm going to have to take a statistics class next quarter. There are so many things to learn, I don't think I can learn it all on my own.

  4. #4
    terzi (TS Contributor)

    If a variable is non-significant in the model, it means the data show no clear evidence of a linear relationship with the response once the other variables are accounted for. Downloads is significant, although Views doesn't seem to be. But before drawing conclusions, you should analyze the relationships individually first, in order to detect their shape, direction and strength.
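Looking at the relationships one at a time could be sketched like this in base R. The data frame `d` is simulated and hypothetical (only Downloads is built to affect Sales), so the numbers say nothing about the real data.

```r
# Hypothetical simulated data: Sales depends on Downloads but not on Views.
set.seed(4)
n <- 90
d <- data.frame(Downloads = rpois(n, 2000), Views = rpois(n, 500))
d$Sales <- 0.1 * d$Downloads + rnorm(n, sd = 3)

# Scatter plot matrix: shows the shape and direction of each pairwise relationship.
pairs(d)

# Pairwise correlations with the response: shows the strength of each linear relationship.
cors <- cor(d)["Sales", c("Downloads", "Views")]
print(round(cors, 2))

# Test of a single correlation, with a confidence interval.
ct <- cor.test(d$Sales, d$Downloads)
print(ct$conf.int)
```

A curved or clustered pattern in the scatter plots would suggest that a straight-line term is the wrong shape for that variable, which a correlation coefficient alone would not reveal.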

    Now, regarding the assumptions of the model: by normality I do mean that your residuals follow a normal distribution. Residuals are not the same as your SSE. A residual is the difference between the actual value of your DV and the value predicted by the model, and the SSE is the sum of the squared residuals. There is a residual for every observation, and it is assumed that these residuals are normally distributed.
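The difference between residuals and the SSE can be seen in a few lines of R. This is a small sketch on simulated, hypothetical data: there is one residual per observation, while the SSE is a single number.

```r
# Hypothetical simulated data for a simple regression.
set.seed(5)
x <- 1:20
y <- 3 + 2 * x + rnorm(20)
fit <- lm(y ~ x)

# One residual per observation: actual value minus predicted value.
res <- y - fitted(fit)

# The SSE is one number: the sum of the squared residuals.
sse <- sum(res^2)

# Both agree with R's built-in accessors.
print(all.equal(unname(res), unname(residuals(fit))))
print(all.equal(sse, deviance(fit)))
```

Least squares chooses the coefficients that make this single SSE value as small as possible, and the normality assumption is about the distribution of the 20 individual residuals.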

    I'm almost certain that the model you fitted is not meeting all assumptions since that is the most common reason for "weird", illogical results.

    It is indeed a great field, so if you have the opportunity to take a statistics course, that will be really helpful. Of course, feel free to come back with any questions you may have.

    Good luck
