
Thread: What to do when the predictors are not what I expected (when the model is fine)?

  1. #16
     Human
     GretaGarbo

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Quote Originally Posted by hlsmith View Post
    I will save ..... my life and skip reading this monster.
    hlsmith just cut down on the acceptable length of a post.

  2. #17
     Pirate
     victorxstc

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Guys, thanks a lot for your replies. As I stated, I had already solved this problem (luckily), and I am going to share here what I did to solve it. It was actually pretty simple.

    In the same Excel file I attached in the lounge, you can see that I entered every variable (one by one) into a new model and checked what happened after entering it. In the Excel file, I highlighted in red the severe and unfavorable changes to the coefficients and commented on them. Noetsi had mentioned the possibility that my beliefs are wrong (and that the model is actually pointing to the wrongness of the commonly held view [theory]). Although I agree that this can happen, it was not the case this time. The correct, uncompromised model was consistent with the commonly held view (except for one variable which had a surprising beta, but its beta was highly consistent and I had already accepted it). In the Excel file I have highlighted the model with the most desirable result. The problem started when some interactions were added to the model.
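
    For reference, the same one-at-a-time check can be scripted. Below is a minimal sketch in Python with statsmodels, using made-up data, made-up column names, and a made-up flagging rule (the actual check was done by hand in the attached Excel file):

    Code:
    # Minimal sketch (hypothetical data): enter predictors one at a time and
    # flag coefficients that flip sign or shift sharply when a new term enters.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.normal(size=(200, 4)), columns=list("ABCD"))
    df["y"] = df["A"] - df["B"] + rng.normal(size=200)  # toy outcome

    entered, prev = [], None
    for var in ["A", "B", "C", "D"]:
        entered.append(var)
        fit = sm.OLS(df["y"], sm.add_constant(df[entered])).fit()
        if prev is not None:
            for name in prev.index:
                old, new = prev[name], fit.params[name]
                if np.sign(old) != np.sign(new) or abs(new - old) > abs(old):
                    print(f"adding {var}: {name} moved {old:.3f} -> {new:.3f}")
        prev = fit.params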

    Therefore, all I did was pin down the problematic interactions. When I detected the first problematic interaction, I looked at its correlation matrix and verified that it had a severe correlation with either or both components of the interaction (or even with other variables). So it was confirmed to be a case of multicollinearity between that interaction and the two other variables. Then I removed the problematic interaction from the model and re-ran my code with the culprit interaction removed.

    My code entered variables one by one. Another problematic interaction then emerged some blocks further on, so I excluded it and went on to hunt down the others with the same method. This way, I managed to exclude five interactions.
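
    A sketch of the hunt-down loop itself, again with hypothetical names (the 0.4 cutoff is the one from the protocol I mention below): check each interaction column's correlations with the other predictors and drop it if any exceed the cutoff.

    Code:
    # Minimal sketch (hypothetical data): drop any interaction whose correlation
    # with another predictor exceeds a cutoff, then refit without the culprits.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    df = pd.DataFrame(rng.normal(size=(300, 3)), columns=["C", "D", "E"])
    df["D"] += 2                    # non-centered, so D*E correlates with E
    df["DxE"] = df["D"] * df["E"]   # the interaction term
    df["y"] = df["C"] + df["D"] + rng.normal(size=300)

    predictors = ["C", "D", "E", "DxE"]
    corr = df[predictors].corr().abs()
    culprits = [p for p in predictors
                if "x" in p and (corr[p].drop(p) > 0.4).any()]
    print("removed:", culprits)     # DxE is flagged here

    kept = [p for p in predictors if p not in culprits]
    print(sm.OLS(df["y"], sm.add_constant(df[kept])).fit().params)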

    The nice point, which makes me super happy, is that each and every one of those problematic interactions that made the model "strange" had high or severe correlations with some of the variables, and the objective protocols pretty well allowed me to remove them for the sake of remedying multicollinearity.

    So there remained a model with almost no severe multicollinearity, and a shining result, for which I am beyond glad. Every beta is the way I want it. The coefficient of that single variable is still the opposite of what I expected, but after polishing the model for several days and obtaining many excellent results, I am now quite sure that the surprising coefficient is OK and that I should change my mind according to this finding.

    -----------------------------------------

    @hlsmith, yeah, I can't be brief, but part of it is because English is not my first language and I have a very limited vocabulary and NO idioms to use.

    Besides, I think that when I need some help, I should ask for it properly. And I personally consider the way I did it "proper": giving every detail the responder might need...

    Many of us reply to posters with "Could you elaborate more on your problem, so that we can help you better?"... Sometimes being brief is good, but whenever being brief can end in such a request, I think it is better not to be brief in the first place.

    -----------------------------------

    Anyway, thanks all for taking the time to kindly participate.

  3. #18
     Human
     GretaGarbo

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Quote Originally Posted by victorxstc View Post

    the objective protocols pretty well allowed me to .... remove them for the sake of remedying multicollinearity.
    Victor has got a problem with objectivity here.


  4. #19
     Fortran must die
     noetsi

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    One thing that confuses me, victor, is that you seem to be addressing multicollinearity and interaction as if they were the same thing. They aren't at all, as far as I know.

    But then I am wrong a lot....
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  5. #20
     Pirate
     victorxstc

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Greta, no. I just didn't go into details. The protocol was to exclude the variables with correlations greater than 0.4, and this is an objective protocol.

    Noetsi, after trinker reminded me of this in this thread, I didn't confuse the two any more. I said there was multicollinearity between the variable D*E and the variables D, E, and C; between the variable C*D and the variables C and D; etc. Here the variable D*E is an interaction, and this interaction is collinear with D and E... I hope everything is OK, but if I was wrong, please let me know.

  6. #21
     Fortran must die
     noetsi

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    It's fine, except don't assume bivariate collinearity is the same thing as multicollinearity. You can have the latter and show no sign of the former. Since the interaction term is D*E, I would think it pretty likely that it is collinear with D or E, although I have never seen that addressed.

    And again, when you suspect multicollinearity you should do a VIF or tolerance test (which you probably have - I got confused about what had been resolved).
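
    In Python's statsmodels, for example, this test is only a few lines. A minimal sketch with made-up data (a common rule of thumb flags VIF above 10, i.e. tolerance below 0.1):

    Code:
    # Minimal sketch (hypothetical data): VIF and tolerance for each predictor.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(2)
    df = pd.DataFrame(rng.normal(size=(300, 3)), columns=["C", "D", "E"])
    df["DxE"] = df["D"] * df["E"]   # interaction term

    X = sm.add_constant(df)         # include the intercept
    for i, name in enumerate(X.columns):
        if name != "const":
            vif = variance_inflation_factor(X.values, i)
            print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.3f}")
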
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  7. The Following User Says Thank You to noetsi For This Useful Post:

    victorxstc (02-11-2013)

  8. #22
     Pirate
     victorxstc

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Thanks noetsi

    I had read a couple of sources, like Wikipedia, and understood that there are different ways to confirm multicollinearity. VIF was among them, but there were other ways as well. For example:

    Some of the common methods used for detecting multicollinearity include:

    1. The analysis exhibits the signs of multicollinearity, such as estimates of the coefficients varying from model to model.

    2. The t-tests for each of the individual slopes are non-significant (P > 0.05), but the overall F-test that all of the slopes are simultaneously 0 is significant (P < 0.05).

    3. The correlations among pairs of predictor variables are large.
    I was seeing all three of these items. The third one appeared only when I entered a couple of 3-way interactions, but the first two were always present.
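
    Symptom 2 in that list is easy to check on a fitted model. A minimal sketch with made-up data, where two nearly duplicated predictors give a significant overall F-test but non-significant individual t-tests:

    Code:
    # Minimal sketch (hypothetical data) of symptom 2: the overall F-test is
    # significant while every individual t-test is not.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(scale=0.05, size=100)  # nearly a copy of x1
    y = x1 + x2 + rng.normal(size=100)

    fit = sm.OLS(y, sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))).fit()
    print("overall F p-value:", fit.f_pvalue)   # very small
    print(fit.pvalues[["x1", "x2"]])            # can both exceed 0.05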

    There were other indicators too. For example:

    Detection of multicollinearity

    Indicators that multicollinearity may be present in a model:

    1. Large changes in the estimated regression coefficients when a predictor variable is added or deleted.

    2. Insignificant regression coefficients for the affected variables in the multiple regression, but a rejection of the joint hypothesis that those coefficients are all zero (using an F-test).

    3. If a multivariate regression finds an insignificant coefficient for a particular explanator, yet a simple linear regression of the explained variable on that explanatory variable shows its coefficient to be significantly different from zero, this situation indicates multicollinearity in the multivariate regression.

    4. Some authors have suggested a formal detection tolerance or the variance inflation factor (VIF) for multicollinearity.

    5. Condition number test: The standard measure of ill-conditioning in a matrix is the condition index. It indicates that the inversion of the matrix is numerically unstable with finite-precision numbers (standard computer floats and doubles), i.e. that the computed inverse is potentially sensitive to small changes in the original matrix. The condition number is computed as the square root of the maximum eigenvalue divided by the minimum eigenvalue. If the condition number is above 30, the regression is said to have significant multicollinearity.

    6. Farrar-Glauber test: If the variables are found to be orthogonal, there is no multicollinearity; if the variables are not orthogonal, then multicollinearity is present.

    7. Construction of a correlation matrix among the explanatory variables will yield indications as to the likelihood that any given couplet of right-hand-side variables is creating multicollinearity problems. Correlation values (off-diagonal elements) of at least 0.4 are sometimes interpreted as indicating a multicollinearity problem.
    I could verify items #1, #2, #3, and #7 in my results. I had already pointed to items 1 and 2 when stating my problem.

    Item 3 was seen when the variable C showed appropriate results in isolation but showed no significance in the context of the multiple regression.

    For item 7, I saw huge correlations between 5 of those interactions (but not all of them) and other variables. The correlations between those interactions and the other variables were mostly very high, and all clearly above 0.4.
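
    For reference, items 5 and 7 are also only a few lines each. A minimal sketch with made-up data (one common variant of the condition number test uses the eigenvalues of the predictors' correlation matrix):

    Code:
    # Minimal sketch (hypothetical data) of items 5 and 7: condition number
    # and large off-diagonal correlations.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(4)
    df = pd.DataFrame(rng.normal(size=(300, 3)), columns=["C", "D", "E"])
    df["DxE"] = df["D"] * df["E"] + 3 * df["E"]  # built to correlate with E

    # Item 5: condition number = sqrt(max eigenvalue / min eigenvalue);
    # values above 30 are read as significant multicollinearity.
    eig = np.linalg.eigvalsh(df.corr().values)
    print("condition number:", round(float(np.sqrt(eig.max() / eig.min())), 1))

    # Item 7: flag predictor pairs whose |correlation| exceeds 0.4.
    corr = df.corr().abs()
    np.fill_diagonal(corr.values, 0)
    print(corr[corr > 0.4].stack())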

    Since the interaction term is D*E, I would think it pretty likely that it is collinear with D or E, although I have never seen that addressed.
    I think not all interactions are collinear with the variables they are made of. Note that in my own results, apart from those 5 interactions, the others did not have high correlations with the variables involved. For example, the variable D*H was not necessarily correlated with either D or H merely because it was composed of D and H.

    It was interesting and pretty convincing that the coefficients suddenly changed direction ONLY when one of those collinear interactions was added to the model. This alone was convincing to me (regardless of the many other pieces of evidence I saw). If the model had become strange even after adding a non-collinear variable, I would have doubted my hypothesis. But I did not see the smallest evidence against my belief that "it is multicollinearity which is causing problems".

    Therefore, by removing those culprit interactions (one of the suggested and possible approaches to solving multicollinearity), I solved the problem.

    And again, when you suspect multicollinearity you should do a VIF or tolerance test (which you probably have - I got confused about what had been resolved).
    Noetsi, I did not dig further into learning and running the tolerance test or VIF, since 1. I already had a handful of convincing evidence, and 2. I did not know I should run a VIF/tolerance test, because Wikipedia introduces it as something suggested by some authors (not necessarily many authors). So I felt no need to run those tests. But if you have read that it is a strong "should" to run VIF, I will learn and run it too (and I am sure it would again verify my hypothesis).

  9. #23
     Fortran must die
     noetsi

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    There are a lot of ways to test for multicollinearity, but I think the best accepted today is either VIF or tolerance. These are actually calculated values with generally accepted levels that indicate multicollinearity, while many other detection methods are more informal (and, like changing signs, could indicate factors other than multicollinearity).

    In a number of different graduate classes on statistics, VIF and tolerance were always stressed, as they are by Fox, who seems to be an expert on the topic.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  10. #24
      Pirate
      victorxstc

    Re: What to do when the predictors are not what I expected (when the model is fine)?

    Thanks. I read about it some more and found that it is the formal way of detecting multicollinearity, as you stated. But they said it is "another" rule-of-thumb method, and that the criterion for diagnosing multicollinearity (VIF > 10) is not provable.

    When I was reading about it, I found something related to my topic, namely the collinearity of "interactions" with the original variables [plus other variables]. I did not see many articles about it, so I am going to share it here. See page 10 of this good document.

    Another interesting thing was that most packages did not offer VIF testing for "logistic regression". Apparently VIF is not so accurate in logistic regression, since the variables are weighted there. So I had to run a GLM in order to estimate VIF (an acceptable solution, according to page 12 of this document). But another problem was that most packages did not allow entering any interactions into OLS!!
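
    This workaround makes sense because VIF is defined purely from the predictors: each one is regressed on all the others and VIF = 1/(1 - R^2), so the outcome (binary or not) never enters. A minimal sketch with made-up data:

    Code:
    # Minimal sketch (hypothetical data): VIF from the auxiliary regressions
    # it is defined by. No outcome variable is involved, which is why an OLS
    # run can stand in for a logistic model when only the VIFs are wanted.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    df = pd.DataFrame(rng.normal(size=(300, 3)), columns=["C", "D", "E"])
    df["DxE"] = df["D"] * df["E"] + df["D"]  # collinear with D by design

    for col in df.columns:
        others = sm.add_constant(df.drop(columns=col))
        r2 = sm.OLS(df[col], others).fit().rsquared
        print(f"{col}: VIF = {1 / (1 - r2):.2f}")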

    But anyhow, I finally managed to test VIF for the interactions, and guess what happened? Those culprit interactions I was talking about, all of them, had VIFs between 48 and 180.

  11. #25
      Fortran must die
      noetsi

    Re: What to do when the predictors are not what I expected (when the model is fine)?


    I am not sure what packages you run, but both SAS and SPSS allow interaction terms. The normal way to run VIF or tolerance for logistic regression is to do it in OLS (linear regression). The actual parameters generated (other than the VIF and tolerance) will be nonsense of course, but that does not matter for the VIF/tolerance. Because these are determined by the relationships among the IVs only, you could use any DV to test the relationship in any form (including the wrong form, obviously).

    VIF and tolerance are not rules of thumb, but the level at which MC matters is, because there is no agreement on exactly when it becomes a major issue. It does not interfere with (bias) the slope estimates; it interferes with the SEs and thus the confidence intervals and statistical tests.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
