Thread: Logistic Regression - predicted probabilities opposite to actual percentages

  1. #1

    Hi! I've asked this question elsewhere but haven't yet got a response that makes much sense to me...

    I have conducted a logistic regression to identify whether student status (student/non-student), time period (time 1, 2 or 3), or condition (condition 1 or condition 2) predicts a binary outcome (whether or not lunch was purchased).

    I have plotted the predicted probabilities that are saved as a result of the logistic regression to visualise the data. These show a decrease in probability of lunch being bought between time 1 and time 2 for one of the conditions. However, when looking at the percentages of people who bought their lunch (rather than the predicted probabilities), there is an increase between time 1 and time 2.

    Is it possible for there to be an increase in terms of percentages but a decrease in terms of probabilities, or does this indicate that something has gone wrong with the model?

    I've been told that it could be Simpson's paradox, that it could be that the model is just a very bad fit, or that it means something has gone wrong along the line somewhere, but I don't know what I need to do to test any of these.

    Is anyone able to help?

  2. #2
    hlsmith

    Was time entered into the model as three dummy variables, or as two dummy variables with a reference group? How exactly is it entered into the model? Also, you plotted the predicted probabilities exported from the model, i.e. scored data. I would imagine that looking at the actual coefficients would be more telling, since the predicted probabilities may be higher or lower because of the other factors in the model. Did you test for interactions?


    Can you provide a description of the model, i.e. logit(p) = b0 + b1*x1 + ... + bk*xk?
    Can you provide the coefficients?


    Simpson's paradox is when the direction of an effect in the pooled data reverses once you stratify by another variable.
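
    To make the Simpson's paradox point concrete, here is a minimal sketch with invented counts (they are not from this study, and the split by student status is only an assumed example of a stratifying variable). The purchase rate falls from time 1 to time 2 within each group, yet rises in the pooled data because the mix of groups changes between the two times.

```python
# Hypothetical (bought, total) counts, invented purely for illustration.
counts = {
    "student":     {"time 1": (9, 10),   "time 2": (80, 100)},
    "non-student": {"time 1": (30, 100), "time 2": (2, 10)},
}


def rate(bought, total):
    """Proportion of people who bought lunch."""
    return bought / total


def pooled(time):
    """Purchase rate at a given time, ignoring the stratifying variable."""
    bought = sum(by_time[time][0] for by_time in counts.values())
    total = sum(by_time[time][1] for by_time in counts.values())
    return bought / total


# Within each stratum the rate goes down between time 1 and time 2.
for group, by_time in counts.items():
    r1 = rate(*by_time["time 1"])
    r2 = rate(*by_time["time 2"])
    print(f"{group}: time 1 = {r1:.2f}, time 2 = {r2:.2f}")

# Pooled over strata the rate goes up.
print(f"pooled: time 1 = {pooled('time 1'):.2f}, time 2 = {pooled('time 2'):.2f}")
```

    One way to check for this in real data would be to tabulate the percentages separately within each level of the other predictors and see whether the direction of the time effect changes.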


    Can you also provide the percentage values for the 3 groups.


    I think we just need a little more info to help. Also can you post a link to your other posts elsewhere, so we can see what feedback you got and info you have provided. Thanks!
    Stop cowardice, ban guns!

  3. #3

    Hi, thanks so much for your response.

    hlsmith wrote: "Was time entered into the model as dummy variables (3) or 2 dummy variables with a reference group? How is it entered into the model?"
    I had one variable coded 0 (reference category), 1 and 2 - will this work, or should I have used dummy variables?
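
    On the coding question, a small sketch of what dummy (indicator) coding does to a 0/1/2 variable. The helper name `dummy_code` and the example values are made up for illustration. A single 0/1/2 column gives the same model only if the software is told the variable is categorical; entered as a plain number it would instead force a straight-line trend in the log-odds across the three times. The fact that the output reports separate Time(1) and Time(2) terms suggests it was treated as categorical here, but that is an inference, not something confirmed in the thread.

```python
def dummy_code(values, reference=0):
    """Expand a categorical column into one 0/1 indicator per non-reference level."""
    levels = sorted(set(values))
    non_reference = [level for level in levels if level != reference]
    return [[1 if value == level else 0 for level in non_reference]
            for value in values]


# Hypothetical time codes for five respondents (0 = reference period).
time = [0, 1, 2, 1, 0]

# Each row is [indicator for code 1, indicator for code 2].
for code, row in zip(time, dummy_code(time)):
    print(code, row)
```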

    hlsmith wrote: "Did you test for interactions?"
    Yes, that's mostly what's of interest for this study: the interaction between time and condition (and potentially between time, condition and student status). The model includes the main effects of time, condition and student status, and then the interaction terms time x condition and time x condition x student status.

    hlsmith wrote: "Can you provide the coefficients."
    Here are my coefficients, with asterisks marking the significant terms (this is a simplified model that doesn't include student status, but the pattern is the same):

    Term                        B
    Time(1)                    -0.361**
    Time(2)                    -0.172
    Condition(1)                0.734**
    Condition(1) by Time(1)     0.111
    Condition(1) by Time(2)     0.452*
    Constant                    0.248**
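
    As a rough check, the predicted probability for each condition-by-time cell can be worked out by hand from these coefficients. The sketch below assumes reference coding in which code 0 is the reference level of both time and condition, so Time(1) and Time(2) are the second and third periods and Condition(1) is the non-reference condition; the dictionary keys and function name are made up for illustration.

```python
import math

# Coefficients as posted above (simplified model without student status).
coef = {
    "const": 0.248,
    "time1": -0.361,
    "time2": -0.172,
    "cond1": 0.734,
    "cond1_x_time1": 0.111,
    "cond1_x_time2": 0.452,
}


def predicted_probability(time, cond):
    """time in {0, 1, 2}, cond in {0, 1}; 0 is assumed to be the reference level."""
    t1 = 1 if time == 1 else 0
    t2 = 1 if time == 2 else 0
    logit = (coef["const"]
             + coef["time1"] * t1
             + coef["time2"] * t2
             + coef["cond1"] * cond
             + coef["cond1_x_time1"] * cond * t1
             + coef["cond1_x_time2"] * cond * t2)
    # Inverse logit turns the log-odds into a probability.
    return 1.0 / (1.0 + math.exp(-logit))


for cond in (0, 1):
    for time in (0, 1, 2):
        print(f"cond code {cond}, time code {time}: "
              f"{predicted_probability(time, cond):.3f}")
```

    Under that coding this gives roughly 0.56, 0.47 and 0.52 for the reference condition and 0.73, 0.68 and 0.78 for the other condition, so the coefficients themselves imply a drop from the reference period to the Time(1) period in both conditions. If the raw percentages show a rise over the same step, the coding of time or of the outcome would be the first thing to re-check.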


    hlsmith wrote: "Can you also provide the percentage values for the 3 groups."
    I've attached a graph showing the percentage of people purchasing their lunch. The two clusters are the two conditions, and the different colours correspond to time 1, time 2 and time 3. I've also attached the predicted probabilities of lunch being purchased, so you can see what I mean about them being so different. (I have also checked that the outcome measure is coded correctly and hasn't been flipped in the analysis.)


    hlsmith wrote: "I think we just need a little more info to help. Also can you post a link to your other posts elsewhere, so we can see what feedback you got and info you have provided. Thanks!"
    No problem - here's the post I made on stackexchange: https://stats.stackexchange.com/ques...-with-percenta

    Some of the responses sounded very useful but when trying to apply their advice I found it really difficult to follow.

    I hope that helps.
    Attached Thumbnails: Picture1.jpg, Picture2.jpg

  4. #4
    hlsmith

    Can we see your syntax? And which group was your reference group: time 3?

  5. #5

    I've just been running it through SPSS so have no syntax I'm afraid - the reference groups are time 1, condition 1 and non-students, all coded as 0.

  6. #6
    hlsmith

    Do your graphs reflect this coding? I would imagine that if blue is time 1 (the reference group), then Time(1) in the model is actually time 2, and, given the increase you see in the percentages, that coefficient should be positive.


    Can you better label your graphs?

  7. #7

    The graphs should represent it - the first graph (the one in which time 2 is in orange) is the percentages (so the percentages increase between time 1 and time 2, and then decrease for time 3), and the second graph is the predicted probabilities, which decrease between time 1 and time 2.

    I hope that makes sense!
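
    One property that may help with testing this: a model containing time, condition and the full time x condition interaction has one parameter per cell, so its fitted probabilities should reproduce the observed cell proportions exactly. The sketch below illustrates that with invented counts (not the study's data), assuming reference coding with code 0 as the reference level. This applies to the simplified model only; a model with student status but without all of its interactions need not match the raw percentages.

```python
import math

# Hypothetical (purchased, total) counts per (condition code, time code) cell.
cells = {
    (0, 0): (45, 80), (0, 1): (50, 90), (0, 2): (40, 85),
    (1, 0): (60, 80), (1, 1): (70, 95), (1, 2): (66, 88),
}


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


cell_logit = {cell: logit(bought / total) for cell, (bought, total) in cells.items()}

# For a saturated model with reference coding, the maximum-likelihood
# coefficients are differences of the observed cell log-odds.
const = cell_logit[(0, 0)]
time_eff = {t: cell_logit[(0, t)] - const for t in (1, 2)}
cond_eff = cell_logit[(1, 0)] - const
interaction = {t: cell_logit[(1, t)] - cell_logit[(1, 0)] - time_eff[t]
               for t in (1, 2)}


def fitted(cond, time):
    """Fitted probability rebuilt from the coefficients."""
    x = const + cond * cond_eff
    if time != 0:
        x += time_eff[time] + cond * interaction[time]
    return inv_logit(x)


for (cond, time), (bought, total) in cells.items():
    print(f"cond {cond}, time {time}: observed = {bought / total:.3f}, "
          f"fitted = {fitted(cond, time):.3f}")
```

    So for the simplified model, a cross-tabulation of the outcome by condition and time should give the same six numbers as the saved predicted probabilities. If it doesn't, the two graphs are probably not built from the same cases or the same coding.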
