
Thread: Basic regression question

  1. #1
    Hi,
    I am currently learning ML algorithms and implementing them in R. I have some basic questions.
    1.) I understand that linear regression is a statistical model and that, before building the model, the data should satisfy certain assumptions (hypotheses), such as:
    >> All the attributes in the dataset must be IID.
    >> Residuals must be normally distributed.
    >> Homoscedasticity among attributes.
    How do I check whether the attributes satisfy these assumptions before building the model?
    >> Does running cor() on the attributes and removing those with high correlation ensure that my attributes are homoscedastic?
    >> Regarding IID, do I need to run t.test() or a chi-square test across all the attributes?
    I may be wrong in many ways; please correct me.
    Sorry if this is a naive question.
    Thanks

  2. #2
    hlsmith
    Most of these tests are performed on the residuals from the model, so they are run after the model is fitted rather than on the attributes beforehand.
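    To make "residuals from the model" concrete, here is a minimal sketch in plain Python (standard library only) for a single predictor. The data values and the helper names fit_line and residuals are made up for illustration and are not from the thread. In R, calling resid() on a fitted lm object returns the same quantity.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Observed minus fitted values; the diagnostics are run on these."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up illustrative data (hypothetical, not from the thread).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
res = residuals(x, y)
```

    The residuals only exist once a model has been fitted, which is why these assumptions cannot be checked on the raw attributes alone.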

  3. #3
    noetsi
    The best way to check whether the residuals are normally distributed is to put them on a QQ plot. Although there are formal tests of homoscedasticity, probably the best way is simply to plot the residuals against the predicted values and see whether a pattern exists. There should not be one if the assumption of homoscedasticity is met. Another assumption that should be checked is linearity; partial regression plots are best for that. Again, there should probably not be a pattern. There is no test I am aware of for independence. If you design your analysis correctly, dependence should not occur. An exception is autocorrelation in time series, for which there are a variety of tests, including Durbin-Watson; although it is the best known, it is not the best test.
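    The checks described above can be sketched numerically in plain Python (standard library only), given a list of residuals and the matching fitted values. The function names are hypothetical. The (i - 0.5) / n plotting positions are one common convention, assumed here. The half-split spread comparison is only a crude stand-in for eyeballing the residual plot, not a formal test.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pairs of (theoretical normal quantile, sorted standardized residual).

    Plotted against each other, the points fall near a straight line when
    the residuals are roughly normal.
    """
    n = len(residuals)
    m, s = mean(residuals), pstdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def spread_by_half(fitted, residuals):
    """Residual spread for the lower and upper halves of the fitted values.

    Very different spreads hint at a funnel shape in the residual plot.
    """
    pairs = sorted(zip(fitted, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pstdev(low), pstdev(high)


def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals.

    The residuals must be in time order. Values near 2 suggest no
    first-order autocorrelation; the statistic ranges from 0 to 4.
    """
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(r ** 2 for r in residuals)
    return num / den


# Made-up residuals that flip sign at every step (hypothetical data).
demo = [1.0, -1.0, 1.0, -1.0]
dw = durbin_watson(demo)  # well above 2: negative autocorrelation
```

    In R, qqnorm() on the residuals and a plot of fitted() against resid() give the graphical versions of the first two checks.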
