Ordinal Logistic Regression

#1
Hello!

I am trying to perform an ordinal logistic regression (or at least I think I am) and I'm a bit stuck.

My variables are all categorical or ordinal:

DV: Rating: High, Medium, Low

IVs: Parent Gender: 0, 1
Child Gender: 0, 1

Essentially, I am trying to understand whether a parent's gender affects how they rate their child, depending on the child's gender. For example: if a parent is female (0), does she rate her female (0) children higher than her male (1) children, etc.?

I need to determine whether the interaction of these two variables is significant and what the odds ratios are. Unfortunately, I can't just run a simple chi-square analysis because my sample size is >10,000.

Any advice on the best way to approach this? I am using R as my primary analysis tool.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
You could perform four chi-squared tests, each with the same reference group and each stratified by the suspected effect modifier. Though you should be able to just do it using ordinal logistic regression. I'm not sure what the go-to package is, but I would imagine you could do it with glm or something similar.
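In R terms, the stratified version could look something like this (rough sketch; "dat" and the column names are just placeholders for your data):

Code:
# Stratify by the suspected effect modifier (parent gender) and test
# Rating vs. child gender within each stratum.
# Assumes a data frame "dat" with Rating, Parent_Gender, Child_Gender.
for (pg in c(0, 1)) {
  sub <- dat[dat$Parent_Gender == pg, ]
  print(chisq.test(table(sub$Child_Gender, sub$Rating)))
}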
 
#3
Hi @hlsmith,

Thanks so much for your reply! Could you clarify what you mean by the four chisq tests? I'm not sure I'm following.

I ended up running the model using the polr function from the MASS package (that's what I saw used in a couple of examples online). Here's my code and output to make things easier.

Do you have any advice on the best way to interpret this?
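(For reference, Rating is set up as a factor with the levels in order before fitting; roughly like this, simplified:)

Code:
library(MASS)  # polr() lives here

# polr() wants the outcome as a factor whose level order reflects the ordering
filtered$Rating <- factor(filtered$Rating,
                          levels = c("Low", "Medium", "High"),
                          ordered = TRUE)
# Parent_Gender and Child_Gender stay as numeric 0/1 indicators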


Code & Output:

> model <- polr(Rating ~ Parent_Gender + Child_Gender, data = filtered, Hess = TRUE)
> model

Call:
polr(formula = Rating ~ Parent_Gender + Child_Gender, data = filtered, Hess = TRUE)

Coefficients:
Parent_Gender  Child_Gender 
   0.08276953    0.28389590 

Intercepts:
 Low|Medium Medium|High 
 -4.1362880   0.8080127 

Residual Deviance: 22908.11
AIC: 22916.11
(903 observations deleted due to missingness)

> summary_table <- coef(summary(model))
> pval <- pnorm(abs(summary_table[, "t value"]), lower.tail = FALSE) * 2
> summary_table <- cbind(summary_table, "p-value" = round(pval, 3))
> summary_table
                    Value Std. Error    t value p-value
Parent_Gender  0.08276953 0.03739474   2.213400   0.027
Child_Gender   0.28389590 0.03388402   8.378459   0.000
Low|Medium    -4.13628799 0.07557562 -54.730458   0.000
Medium|High    0.80801272 0.03439466  23.492386   0.000
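Also, would something along these lines be the right way to pull the odds ratios (and confidence intervals) out of this model? I haven't fully convinced myself it's correct:

Code:
# Tentative attempt at odds ratios with 95% confidence intervals
ors <- exp(coef(model))     # odds ratios for the slope terms (not the cutpoints)
ci  <- exp(confint(model))  # profile-likelihood CIs; can take a moment on large data
cbind(OR = ors, ci)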
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
You made it sound like you were also interested in the interaction term between your two IVs? So that isn't the case?
 
#5
Hi @hlsmith,

Yes, that is what I want, or at least I want to understand whether there is a significant difference in how parents rate their children based on the interaction of their genders.

My thought was that OLR would provide me with the log odds, so I could compare the odds of female parents giving their female children high ratings versus their male children, etc. But maybe that isn't the case.
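(What I was picturing, in rough code terms, was something like predicted probabilities for each parent/child gender combination, using the model from my earlier post, though I may be off base:)

Code:
# Predicted probability of Low/Medium/High for each gender combination
newdat <- expand.grid(Parent_Gender = c(0, 1), Child_Gender = c(0, 1))
cbind(newdat, predict(model, newdata = newdat, type = "probs"))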

My main reservation about using a chi-square analysis, even across multiple tables, is that the sample size is so large that it would flag even tiny differences between observed and expected counts as significant, even when no meaningful difference exists.

Am I thinking about this correctly?

Thank you so much.
 

hlsmith

Less is more. Stay pure. Stay poor.
#6
I don't think your model above includes an interaction term.

Code:
Rating ~ Parent_Gender + Child_Gender + Parent_Gender*Child_Gender
Yeah, I wouldn't use chi-sq either. Effect estimates with confidence intervals are much better!
 
#7
Ah! You're totally right; thank you for catching that in my model. When I re-run the model with the interaction term, I get the results below, and the AIC is a bit lower (although not by much), which I think is good.

Call:
polr(formula = Rating ~ Parent_Gender + Child_Gender + Parent_Gender * Child_Gender,
    data = filtered, Hess = TRUE)

Coefficients:
             Parent_Gender               Child_Gender Parent_Gender:Child_Gender 
                0.02860946                 0.20393190                 0.11280902 

Intercepts:
 Low|Medium Medium|High 
 -4.1682561   0.7750662 

Residual Deviance: 22905.84
AIC: 22915.84
(903 observations deleted due to missingness)

> summary_table2 <- coef(summary(model_int))
> pval <- pnorm(abs(summary_table2[, "t value"]), lower.tail = FALSE) * 2
> summary_table2 <- cbind(summary_table2, "p-value" = round(pval, 3))
> summary_table2
                                 Value Std. Error     t value p-value
Parent_Gender               0.02860946 0.05200919   0.5500846   0.582
Child_Gender                0.20393190 0.06321883   3.2258097   0.001
Parent_Gender:Child_Gender  0.11280902 0.07490642   1.5059993   0.132
Low|Medium                 -4.16825611 0.07876843 -52.9178539   0.000
Medium|High                 0.77506618 0.04077741  19.0072439   0.000
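(If I'm reading the UCLA page correctly, the next step would be to exponentiate these to put them on the odds-ratio scale, where the interaction term becomes a ratio of odds ratios, roughly exp(0.113) ≈ 1.12 here:)

Code:
exp(coef(model_int))     # slope terms on the odds-ratio scale
exp(confint(model_int))  # matching 95% intervals (profiling can be slow)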


So, if I'm interpreting these results correctly, this means that the interaction between a parent's gender and their child's gender does not have a significant impact on the child's rating (high, medium, low).

However, a child's gender does have a significant impact on how they are rated? How is that the case?

I apologize for all of my questions; it's my first time running an ordinal logistic regression, and I really want to understand what it means. I'm currently using this article https://stats.idre.ucla.edu/r/faq/ologit-coefficients/ as a reference.

Thank you again!
 

hlsmith

Less is more. Stay pure. Stay poor.
#8
Yeah, the UCLA articles and anything by Paul Allison are usually helpful as introductions.

Correct: the interaction does not explain anything additional beyond chance, conditional on your sample (which you state is large). The common advice is not to over-interpret the base terms within an interaction in the saturated model, since their estimates are conditional on each other. I don't think that is as much of an issue here, as long as a person understands what each of the model terms is defining.
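If you want to check that formally, you can compare your two fitted models directly (using the model and model_int objects from your posts), something like:

Code:
anova(model, model_int)   # likelihood ratio test for adding the interaction
AIC(model, model_int)     # same comparison on the AIC scale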