Logistic regression - a very simple question

#1
I have a question regarding the likelihood. I am trying to assess the likelihood of being in the family-history-positive group (FH+) for alcoholism (0 means no family history of alcoholism, 1 means a family history of alcoholism), based on the amount of drinking (a continuous variable). Amount of drinking is standardized, so that 0 is the mean and one unit is 1 SD.
I get these results from logistic regression:
Beta intercept -1.163,
Beta alcohol consumption -0.535,
OR 0.586.
I have trouble interpreting the results. Is this interpretation correct?

The odds of FH+ are multiplied by 0.586 (a decrease of about 41%) for each one-unit (i.e., 1 SD) increase in alcohol consumption. In other words, the odds of FH+ increase by a factor of 1.706 (the reciprocal of the OR estimate gives the estimate for the reciprocal odds; 1/0.586 = 1.706) for a 1 SD decrease in alcohol consumption. Or, an adult scoring 1 SD below the mean was 1.706 times more likely to be in the FH+ group than an adult with a score of 0 (the mean)?

Now, I want to say this in likelihood (probability) language, because people rarely understand ORs. Can someone tell me how to interpret these results as probabilities?
 

hlsmith

Omega Contributor
#2
Yeah, your results seem fine. Offhand, I believe you compute exp(linear predictor) / (1 + exp(linear predictor)), where the linear predictor is intercept + beta * consumption, to get predicted probabilities.
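As a minimal sketch of that conversion, the snippet below plugs the coefficients reported in post #1 (intercept -1.163, slope -0.535) into the inverse-logit formula at -1 SD, the mean, and +1 SD. The function names `inv_logit` and `prob_fh` are illustrative only and do not come from any particular package.

```python
import math

def inv_logit(eta):
    """Convert a log-odds value (linear predictor) to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Coefficients as reported in post #1.
b0 = -1.163   # intercept: log-odds of FH+ at mean consumption (z = 0)
b1 = -0.535   # change in log-odds per 1 SD increase in consumption

def prob_fh(z):
    """Predicted probability of FH+ at standardized consumption z."""
    return inv_logit(b0 + b1 * z)

for z in (-1.0, 0.0, 1.0):
    print(f"consumption {z:+.0f} SD: P(FH+) = {prob_fh(z):.3f}")

def odds(p):
    return p / (1.0 - p)

# The ratio of odds for a 1 SD increase reproduces exp(b1), the reported OR.
odds_ratio = odds(prob_fh(1.0)) / odds(prob_fh(0.0))

# The ratio of probabilities (a risk ratio) is a different quantity.
risk_ratio = prob_fh(-1.0) / prob_fh(0.0)

print(f"odds ratio per +1 SD: {odds_ratio:.3f}")
print(f"P(FH+ at -1 SD) / P(FH+ at mean): {risk_ratio:.2f}")
```

With these coefficients the predicted probabilities come out at roughly 0.35, 0.24 and 0.15. The ratio of the first two is about 1.46, not 1.706, so "1.706 times more likely" holds for odds but not for probabilities.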

The issue I see is the temporal relationship of the variables: how can an earlier variable (family history) depend on a later event (current drinking)? Also, you can't randomize alcohol consumption, so after flipping the variable order, if you are truly trying to lay out the question, you would want to control for covariate differences between groups. Also check for a linear relationship between consumption and the logit.
 
#3
thanks hlsmith. the questions from the professor are:

Is there correspondence between a family history of alcoholism and alcohol consumption? In particular, does the likelihood of a family history of alcoholism increase when adults have high consumption? Conversely, does the likelihood of a family history decrease in adults with low levels of consumption? In other words, if alcoholism is heritable/environmental/familial, then we should expect there to be less likelihood of a family history in low-consuming compared to high-consuming adults.

what is the correct statistical procedure to answer this question if LR isn't?
 

hlsmith

Omega Contributor
#4
And then the professor provided you with the dataset? What class is this for, and do they say to use logistic regression in particular? It is just a "weirdly" written question. I would attempt to visualize the data (e.g., box plots for groups). I would also look at histograms for the groups to see what the distributions look like. Another option is to enter alcohol consumption as a spline term or to use a generalized additive model.


Your attempt will probably suffice.
 
#7
I'm sorry to interrupt, but please allow this one problem:
If alcohol consumption was an ordinal variable with 4 levels (1-4), and we test it in regression using three dummy variables (for 2-4), each odds ratio/regression coefficient/P value would be interpreted in comparison to baseline alcohol consumption.

1. is there a way to conclude about the difference between e.g. level 2 and 3 from such results? (higher odds ratio for 3 than for 2 would suggest stronger effect associated with higher category? regression needs to be repeated with different allocation of dummy variables?)
2. should P values for each dummy variable be interpreted using Bonferroni correction since we actually perform multiple simultaneous comparisons?
 

hlsmith

Omega Contributor
#8
Yes, I would ideally correct my alpha for family-wise error rate (e.g., Bonferroni correction).


If I get time I will follow-up on your other questions.
 

hlsmith

Omega Contributor
#9
I can't think of a test off the top of my head, but I may stew on it. A few workarounds:


1. People sometimes enter the ordinal variable into the model as a continuous variable and, if it is significant, report a positive trend in the variable.


2. Note whether the ORs' 95% CIs exclude each other or overlap substantially.


3. Convert odds into predicted probabilities with 95% CI and report those.


Though I will note that the model gives you the coefficients and SEs, so I wouldn't be surprised if there is a formal test. If you run a formal test, though, its alpha level may also need to be corrected for false discovery.
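One such test is a Wald test on the difference between two dummy coefficients. The sketch below assumes you can extract the two coefficients plus their variances and covariance from the fitted model's variance-covariance matrix. The function name `wald_contrast` and all numbers in the example call are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def wald_contrast(b_a, b_b, var_a, var_b, cov_ab):
    """Wald z-test for H0: b_b - b_a = 0 between two dummy coefficients.

    var_a, var_b are squared standard errors; cov_ab is the off-diagonal
    entry of the coefficient covariance matrix for the two dummies.
    """
    diff = b_b - b_a
    se = math.sqrt(var_a + var_b - 2.0 * cov_ab)
    z = diff / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, se, z, p

# Hypothetical values for level 2 and level 3 (both versus level 1).
diff, se, z, p = wald_contrast(b_a=0.40, b_b=0.90,
                               var_a=0.04, var_b=0.05, cov_ab=0.02)

print(f"log-OR, level 3 vs level 2: {diff:.2f} (SE {se:.3f})")
print(f"OR, level 3 vs level 2: {math.exp(diff):.2f}")
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Dummy coefficients that share a baseline are usually positively correlated, so the covariance term matters. Eyeballing whether two confidence intervals overlap ignores it and can be misleading. Refitting the model with level 2 as the reference category should give the same comparison directly.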
 
#10
Thank you for your opinion,

Would you adjust the P value threshold for 3 comparisons (each group against the baseline), P < 0.0167, or for 6 comparisons (each group against each other, as in a univariate approach), P < 0.008?
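For reference, the two thresholds in this question can be reproduced as follows, assuming a nominal alpha of 0.05 and the 4-level variable from post #7. This only shows the arithmetic; it does not settle which family of comparisons is the appropriate one.

```python
from math import comb

alpha = 0.05   # assumed nominal significance level
levels = 4     # ordinal alcohol consumption categories (1-4)

# Each non-baseline level compared with the baseline: 3 tests.
n_vs_baseline = levels - 1

# Every level compared with every other level: C(4, 2) = 6 tests.
n_all_pairs = comb(levels, 2)

threshold_vs_baseline = alpha / n_vs_baseline
threshold_all_pairs = alpha / n_all_pairs

print(f"{n_vs_baseline} comparisons: P < {threshold_vs_baseline:.4f}")
print(f"{n_all_pairs} comparisons: P < {threshold_all_pairs:.4f}")
```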
 
#11
I am trying to assess the likelihood of being in the family-history-positive group (FH+) for alcoholism (0 means no family history of alcoholism, 1 means a family history of alcoholism), based on the amount of drinking (a continuous variable). Amount of drinking is standardized, so that 0 is the mean and one unit is 1 SD.
I get these results from logistic regression:
Beta intercept -1.163,
Beta alcohol consumption -0.535,
OR 0.586.


I have trouble interpreting the results. Is this interpretation correct?
I also have trouble interpreting the results.
Increased alcohol consumption decreases the probability of alcohol problems! There is a negative slope.

It is also known that alcohol protects against heart attack. Time for a glass of wine, anybody?

1. is there a way to conclude about the difference between e.g. level 2 and 3 from such results? (higher odds ratio for 3 than for 2 would suggest stronger effect associated with higher category? regression needs to be repeated with different allocation of dummy variables?)
Compare the estimated proportions and use the estimated standard errors and do a z-test.
 
#12
I interpret it as: people with a positive family history of alcoholism (those exposed to the negative effects of living with an alcohol abuser) are less likely to drink.
 
#13
What would be the appropriate Bonferroni correction in this case: for 3 (every category against the baseline) or for 6 (every category against each other) simultaneous comparisons?