
Thread: Assistance with Linear Regression Model (Model is Not Full Rank)





    Hello all,

    I'm currently doing a project in a computer statistics class of mine, and I have run into some potential trouble. I'll admit, I'm not as proficient in the ways of statistics as I'd like to be, so I'll try to be as clear as possible, which will hopefully make identifying issues easier. So first, background info:

    My data is on movie stats, which has the following variables:
    Title - Total Gross - Total Theater No - Season - Rating - Genres

    with 'Season', 'Rating', and 'Genres' being separated into individual, binary variables


    With this, I have been trying to build four models, one for each season, so that I can estimate which season would be best for releasing a given movie, based on its rating, its genres, and the total number of theaters it is planned to open in. In other words,
    Total Gross = B0 + B1(TheaterNo) + B2(G) + ... + B5(R) + B6(Action) + ... + B13(Romance),
    with every variable other than TheaterNo being a binary (0/1) variable.


    It turned out that Total Gross was not normally distributed, and my professor recommended using a 'Johnson Transformation' in R. So I transformed both Total Gross and TheaterNo with this method, and left the rest alone, since they are binary dummy variables.


    I'm currently trying to run this new data set through SAS, and I keep running into issues with my model not being full rank. For example, when running this regression on all movies released in the winter:

    proc reg data=Winter;
    model Total_Gross = Total_Theater_No G PG PG13 RR Action Adventure Comedy Thriller Drama Musical Horror Romance;
    run;
    quit;

    I get:

    Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.

    RR = Intercept - G - PG - PG13


    * I used RR for an R rating
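
    My tentative reading of the note is that every movie has exactly one rating, so G + PG + PG13 + RR equals 1 in every row, which is the same column as the intercept. Below is a minimal sketch of how that could be checked and worked around, assuming the Winter data set and the variable names above. RatingCheck and RatingSum are hypothetical names I made up, and leaving out RR (rather than another rating) is just one possible choice of reference level.

```sas
/* Assumption: each movie carries exactly one of the four rating flags.
   If so, RatingSum is 1 in every row, duplicating the intercept column. */
data RatingCheck;              /* hypothetical data set name */
    set Winter;
    RatingSum = G + PG + PG13 + RR;   /* hypothetical variable name */
run;

proc freq data=RatingCheck;
    tables RatingSum;          /* a single level of 1 would confirm the dependence */
run;

/* Refit without RR, so R-rated movies become the reference level and the
   G, PG, and PG13 coefficients are differences relative to R. */
proc reg data=Winter;
    model Total_Gross = Total_Theater_No G PG PG13
                        Action Adventure Comedy Thriller Drama
                        Musical Horror Romance;
run;
quit;
```

    If that reading is right, the genre flags were probably not reported because a movie can have more than one genre, so they do not add up to a constant.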

    Can anyone give me any insight into this? If it helps, I have attached two Excel files. The first, MovieStatsN, has the original data; the second, MovieStatsT, has the transformed data.

    Any assistance is greatly appreciated, and if there's any way I can help make this easier, just let me know!
