Strange results of logistic regression

Member
Hi everyone,
I am getting strange results from a logistic regression.
The model is as follows:
dependent: success of treatment, 0 - No, 1 - Yes,
possible predictors: treatment, 1 or 2 (numeric), and sex with two different codings, first: (F=0, M=1), second: (F=1, M=0).

If I fit the regression with treatment alone, the coefficient for treatment is significant (p=0.03).
If I fit the regression with treatment and sex (either coding) without an interaction, the coefficient for sex is not significant and the coefficient for treatment is significant (p=0.02).

If I fit the regression with treatment and sex (FIRST coding) with an interaction, the coefficient for sex is not significant, the coefficient for treatment is NOT significant (p=0.38), and the interaction coefficient is not significant.

If I fit the regression with treatment and sex (SECOND coding) with an interaction, the coefficient for sex is not significant, the coefficient for treatment is significant (p=0.03), and the interaction coefficient is not significant.

I don't understand why the significance of treatment depends on which coding I use for sex.
Has anyone had to deal with such a situation?
Many thanks in advance!

noetsi

No cake for spunky
There is no reason how you code a dummy variable should have any impact on anything except the sign of the effect. I would check for a coding mistake or maybe a mistake in the data you are using.

If you have an interaction, then some question the value of interpreting main effects at all. At the least, you have to interpret a main effect at one specific level (only) of the interacting variable, not at all levels as you apparently have.

Dason

Ambassador to the humans
Is the p-value for the interaction the same in both cases?

hlsmith

Less is more. Stay pure. Stay poor.
Results should be comparable. Post code and output for your last two listed models and we will help unpack the results. This is the easiest way for us to see what you are doing and writing about.

Thanks!

Member
I agree that the coding should not influence the results; that is why I posted this question. I will post the code in 10 hours. I don't understand what mistake there could be in the data. Why should I interpret the treatment effect for only one value of sex?

Member
I will post the SAS code and output in 10 hours.
Thank you, and thanks to everyone who responded!

Member
I use SAS University Edition; the code is below:

***********************************************************************************

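/* Build the analysis data set: binary response plus two mirror-image gender codes */
DATA t2_work;
SET t2_source;
FORMAT Response 1.0;
FORMAT Gender_num_F0M1_ 1.0;
FORMAT Gender_num_F1M0_ 1.0;
/* Response = 1 when responseCategory is PR or CR, otherwise 0 */
IF (responseCategory="PR" OR responseCategory="CR") THEN Response=1; ELSE Response=0;
/* First coding: FEMALE=0, MALE=1 (left missing for any other gender value) */
IF (gender="FEMALE") THEN Gender_num_F0M1_=0; ELSE IF (gender="MALE") THEN Gender_num_F0M1_=1;
/* Second coding: FEMALE=1, MALE=0 */
IF (gender="FEMALE") THEN Gender_num_F1M0_=1; ELSE IF (gender="MALE") THEN Gender_num_F1M0_=0;
RUN;

/* Value labels (defined here but not attached to the variables above) */
PROC FORMAT;
VALUE Response 0='Non-responder' 1='Responder';
VALUE Gender_num_F0M1_ 0='FEMALE' 1='MALE';
VALUE Gender_num_F1M0_ 0='MALE' 1='FEMALE';
RUN;

/* log regressions */

/* Interaction model, first gender coding (F=0, M=1) */
proc logistic data=t2_work;
model Response (EVENT='1') = Gender_num_F0M1_ TRTPN Gender_num_F0M1_*TRTPN;
run;

/* Interaction model, second gender coding (F=1, M=0) */
proc logistic data=t2_work;
model Response (EVENT='1') = Gender_num_F1M0_ TRTPN Gender_num_F1M0_*TRTPN;
run;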

***********************************************************************************

The results are in the attached file.


hlsmith

Less is more. Stay pure. Stay poor.
Well, when dealing with interaction terms it is standard practice to ignore the base terms in the model, since they are conditional on each other and don't have an independent interpretation. Thus, you shouldn't care about the base terms. We can see the interaction term did not change, which is what we would care about.

The base term changed because the base case (reference value) of sex was switched. The TRTPN coefficient is now the log odds increase for the other group (the new base case), and the intercept is the baseline log odds for that group. This may all seem strange at first, but the interpretations are just different for those terms.
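
To make the switched reference level concrete, the algebra can be written out. The symbols here are illustrative notation, not taken from the SAS output: s is the F=0, M=1 sex code, s' = 1 - s is the F=1, M=0 code, and t is TRTPN.

```latex
\begin{align*}
\text{Coding 1:}\quad \operatorname{logit} P(Y=1)
  &= \beta_0 + \beta_1 s + \beta_2 t + \beta_3 s t \\
\text{Coding 2:}\quad \operatorname{logit} P(Y=1)
  &= \gamma_0 + \gamma_1 s' + \gamma_2 t + \gamma_3 s' t \\
\text{Substituting } s = 1 - s' \text{ in coding 1:}\quad
  &= (\beta_0 + \beta_1) - \beta_1 s' + (\beta_2 + \beta_3)\, t - \beta_3 s' t \\
\text{so}\quad
  \gamma_0 = \beta_0 + \beta_1,\quad
  \gamma_1 &= -\beta_1,\quad
  \gamma_2 = \beta_2 + \beta_3,\quad
  \gamma_3 = -\beta_3 .
\end{align*}
```

Under this sketch, the sex and interaction coefficients only change sign, so their p-values are identical in the two models. The TRTPN coefficient is a different quantity in each: the treatment effect for females under coding 1, and the treatment effect for males under coding 2. That is consistent with the two different p-values reported for treatment.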

noetsi

No cake for spunky
Incidentally, to disagree somewhat with my learned colleague hlsmith: not everyone ignores the main effects when an interaction is present. It is not uncommon to interpret a main effect at specific levels of the interacting variable (admittedly a different form of interpretation). Some call these simple effects.

hlsmith

Less is more. Stay pure. Stay poor.
I know where you are going with this @noetsi, but can you elaborate on how the OP would do this given their example?

Thanks.

noetsi

No cake for spunky
I am not sure how you do this in logistic regression. In linear regression you tell the software to estimate the impact of X1 on Y at some specific level of X2 when X1 and X2 are interacting predictors.

I would have to go back and look at my SAS code to see how you do this.
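
One possible way to get simple effects in PROC LOGISTIC is the CONTRAST statement. The following is only a sketch, not code from anyone in this thread. It assumes the t2_work data set and the variable names from the code posted above, with TRTPN numeric (1 or 2) and Gender_num_F0M1_ coded 0 = female, 1 = male. The contrast labels are made up for illustration.

```sas
/* Sketch: treatment effect within each sex from a single interaction model. */
/* Assumes TRTPN is numeric and Gender_num_F0M1_ is 0 = FEMALE, 1 = MALE.    */
proc logistic data=t2_work;
  model Response (EVENT='1') = Gender_num_F0M1_ TRTPN Gender_num_F0M1_*TRTPN;
  /* Females (gender code 0): the treatment slope is the TRTPN coefficient alone */
  contrast 'Treatment 2 vs 1, females' TRTPN 1 / estimate=both;
  /* Males (gender code 1): the treatment slope is TRTPN plus the interaction */
  contrast 'Treatment 2 vs 1, males' TRTPN 1 Gender_num_F0M1_*TRTPN 1 / estimate=both;
run;
```

With ESTIMATE=BOTH, each contrast is reported both as a log odds ratio and as an odds ratio, together with a Wald test. If the reasoning above is right, the two contrasts should come out the same whichever gender coding the model uses, as long as the contrast coefficients are adjusted to match the coding.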

Member
Maybe it would be better to treat these predictors as text (categorical) variables, since their values have no natural order? I tried this and got consistent results. By default SAS assigned binary text variables with values -1 and 1. I used these codings: treatment values 1 and 2 as 1 and -1, gender as FEMALE = 1, MALE = -1; and conversely, treatment values 1 and 2 as -1 and 1, gender as FEMALE = -1, MALE = 1. Then I built the models for all combinations, and all the results are the same.


noetsi

No cake for spunky
"By default SAS assigned binary text variables with values -1 and 1."
From memory, that is effect coding. Personally I would use reference coding, where the values are normally 0 and 1. This is the most common way the analysis is done.
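
For anyone wanting to try reference coding with a text variable, here is a minimal sketch. It assumes the character variable gender (values "FEMALE" and "MALE") is still present in t2_work, as the earlier DATA step suggests. Choosing FEMALE as the reference level is arbitrary.

```sas
/* Sketch: let SAS build a 0/1 reference-coded dummy for gender        */
/* instead of the default effect (-1/1) coding. FEMALE is the          */
/* reference level here purely as an example.                          */
proc logistic data=t2_work;
  class gender (ref='FEMALE') / param=ref;
  model Response (EVENT='1') = gender TRTPN gender*TRTPN;
run;
```

With PARAM=REF, the TRTPN coefficient is the treatment effect at the reference level of gender, so changing the REF= level should change that coefficient in the same way as swapping the two numeric codings did. TRTPN is left numeric in this sketch. It could be added to the CLASS statement in the same way.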