Thread: Cross validation

  1. #1

    Cross validation




    Hello,
    I have a basic question about cross-validation and regression models. What is the final regression model built from? I mean the final model that we report, for example in our research: is it the model fit to the whole data set or the model fit to the training set? And what is the best type of cross-validation for 32 observations? I am really confused about what the point of cross-validation is.
    Regards

  2. #2
    victorxstc

    Re: Cross validation

    You should provide more details about your study. Since you are talking about a final model, I assume you are running a stepwise regression. In such a regression, the final model is the one with the best indicators of model fit, chosen blindly by the computer algorithm. It consists of one dependent variable and the selected independent variables that are most strongly associated with it.

    Since you are talking about training and test sets, I assume you want to fit a model and then test its predictive merit. In that case, the model is built from the training set (not the test set). What is the model here? A dependent variable and a number of independent variables, plus their beta values (coefficients) and standard errors. You then apply this model to the test set and see how well it predicts the dependent variable from the values of the independent variables.
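
    A minimal sketch of that train/test workflow, using only the Python standard library. The data, the single-predictor model, and the helper names fit_line and rmse are made up for illustration; nothing here comes from the poster's study.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(intercept, slope, x, y):
    """Root mean squared prediction error of the fitted line on (x, y)."""
    sq_errors = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


# Hypothetical data: one predictor, eight observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

# Hold out the last two observations to keep the example deterministic;
# a real analysis would choose the test rows at random.
x_train, y_train = x[:6], y[:6]
x_test, y_test = x[6:], y[6:]

# The model is estimated from the training set only...
intercept, slope = fit_line(x_train, y_train)

# ...and then judged on observations it has never seen.
train_rmse = rmse(intercept, slope, x_train, y_train)
test_rmse = rmse(intercept, slope, x_test, y_test)
print(f"slope={slope:.3f} train RMSE={train_rmse:.3f} test RMSE={test_rmse:.3f}")
```

    The test-set error is the figure that speaks to prediction; the training-set error only describes how well the model fits the data it was built from.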

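    Cross-validation is a way to estimate the predictive value of a model without a separate test set: observations are held out in turn (one at a time or in groups), the model is fit to the rest, and the held-out values are predicted. You can therefore use cross-validation to optimize the model further (on the training set) before applying it to the test set.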
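
    The hold-out-and-predict idea can be sketched as a k-fold loop. This is an illustrative sketch, not any particular package's routine: the data and the names fit_line and k_fold_rmse are hypothetical, and folds are formed by taking every k-th observation rather than by random assignment. Setting k equal to the number of observations gives leave-one-out cross-validation, which is one common choice for small samples.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_rmse(x, y, k):
    """Cross-validated RMSE: every observation is predicted exactly once
    by a model that was fit without it."""
    n = len(x)
    sq_errors = []
    for fold in range(k):
        held_out = range(fold, n, k)  # indices fold, fold + k, fold + 2k, ...
        x_fit = [x[i] for i in range(n) if i not in held_out]
        y_fit = [y[i] for i in range(n) if i not in held_out]
        intercept, slope = fit_line(x_fit, y_fit)
        for i in held_out:
            sq_errors.append((y[i] - (intercept + slope * x[i])) ** 2)
    return (sum(sq_errors) / n) ** 0.5


# Hypothetical data: one predictor, eight observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

print("4-fold CV RMSE:      ", round(k_fold_rmse(x, y, 4), 3))
print("leave-one-out RMSE:  ", round(k_fold_rmse(x, y, len(x)), 3))
```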

    Again, you should provide more details regarding your study, variables, etc.
    "victor is the reviewer from hell" -Jake
    "victor is a machine! a publication machine!" -Vinux

  3. The Following User Says Thank You to victorxstc For This Useful Post:

    Bahareh (10-01-2015)

  4. #3

    Re: Cross validation

    I have 32 observations and 27 independent variables. I ran a multiple linear regression with the "forward" selection method and chose the model with a high adjusted R2. This model contains only 2 of the 27 initial variables. Now I have been asked to do cross-validation to assess the predictive ability of this model. To do this, I ran another regression, this time with only those 2 variables (Enter method), and selected the option to calculate PRESS (the sum of squares of the prediction errors). For PRESS, the regression is refit on the remaining n-1 observations and the fitted regression function is used to predict the ith observation. Along with PRESS, the software also reports the predicted R2. My PRESS value is 13.31 and R2(pred) is 56.57%. Is this the predictive ability of my regression model?
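
    For readers who want to see the PRESS procedure spelled out, here is a minimal sketch under stated assumptions. It assumes the software defines R2(pred) as 1 - PRESS/SST, where SST is the total sum of squares about the mean; the thread does not confirm this. The data and the helper names (solve, fit_ols, predict, press_and_predicted_r2) are hypothetical. If that definition holds, the reported values PRESS = 13.31 and R2(pred) = 56.57% would imply SST of roughly 30.65.

```python
def solve(a, b):
    """Solve the square linear system a * coef = b by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def press_and_predicted_r2(rows, y):
    """Refit without observation i, predict observation i, and sum the
    squared prediction errors. Assumes R2(pred) = 1 - PRESS / SST."""
    n = len(y)
    press = 0.0
    for i in range(n):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, rows[i])) ** 2
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return press, 1.0 - press / sst


# Hypothetical data: two predictors per observation, eight observations.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0),
        (5.0, 6.0), (6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]
y = [1.2, 3.9, 3.1, 5.8, 5.1, 8.2, 6.9, 10.1]

press, r2_pred = press_and_predicted_r2(rows, y)
print(f"PRESS={press:.3f} predicted R2={r2_pred:.3f}")
```

    Note that this loop refits only the final two-variable model. It does not repeat the variable selection inside each refit, so it says nothing about how much the selection step itself was tuned to the sample.
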
    Last edited by Bahareh; 10-01-2015 at 07:52 AM.

  5. #4
    victorxstc

    Re: Cross validation

    Yes, it seems to be the predicted R-squared of your model.

    Other points I can add:

    Selecting the best model by adjusted R-squared is an accepted approach, although AIC and other criteria are recommended as well. However, stepwise regression (which includes forward selection, the method you used) is not the best way to determine the model, and it is strongly recommended to avoid it. Instead, try building the model on the basis of theory (subjective common sense plus the literature) and beta values. As I said, the computer selects the model for you blindly, and this "blindly" can become a serious problem. That said, I agree that the stepwise method is very commonly practiced.
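
    For reference, the two criteria mentioned above can be computed as follows. This is a sketch with assumptions: the AIC shown is the Gaussian least-squares form up to an additive constant, and conventions differ on whether the error variance counts as a parameter, so only differences between models fit to the same data are meaningful. The function names and the example numbers are hypothetical.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors plus an intercept,
    fit to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def aic_gaussian(rss, n, k):
    """AIC (up to an additive constant) for a least-squares fit with
    residual sum of squares rss and k estimated coefficients."""
    return n * math.log(rss / n) + 2 * k


n = 32  # sample size discussed in the thread

# Hypothetical comparison: three extra predictors buy a tiny gain in R^2.
small_model = adjusted_r2(0.60, n, p=2)
large_model = adjusted_r2(0.61, n, p=5)
print(f"adjusted R2, 2 predictors: {small_model:.3f}")
print(f"adjusted R2, 5 predictors: {large_model:.3f}")

# Same residual sum of squares, one more coefficient: AIC rises by 2.
print(aic_gaussian(10.0, n, k=4) - aic_gaussian(10.0, n, k=3))
```

    Both criteria penalize extra predictors, but neither corrects for the fact that the predictors were chosen by searching through many candidates.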

    Plus, 32 observations are too few to evaluate 27 candidate independent variables reliably.

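    One way to see why blind selection from 27 candidates with 32 observations is risky is a small simulation. This is a hypothetical sketch, not an analysis of the poster's data: the outcome and all 27 predictors are independent random noise, so any apparent association is spurious. The first step of forward selection picks the predictor with the highest single-variable R-squared, which is what best_r2 represents here.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. the R^2 of a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_obs, n_candidates = 32, 27

# Outcome and candidate predictors are all pure noise, unrelated by construction.
y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
candidates = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)]
              for _ in range(n_candidates)]

r2_values = [r_squared(x, y) for x in candidates]
best_r2 = max(r2_values)               # what the selection step would pick
mean_r2 = sum(r2_values) / n_candidates  # what a typical noise predictor gives

print(f"mean R2 over {n_candidates} noise predictors: {mean_r2:.3f}")
print(f"R2 of the selected (best) noise predictor:   {best_r2:.3f}")
```

    The selected predictor always looks at least as good as the average candidate, even though none of them carries any signal. The size of the gap depends on the seed, so the printed values should be read as one illustrative draw.
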
  6. The Following User Says Thank You to victorxstc For This Useful Post:

    Bahareh (10-01-2015)

  7. #5

    Re: Cross validation

    Thanks a lot victor, you've helped many times...

  8. #6
    victorxstc

    Re: Cross validation


    My pleasure, Bahareh.
