Interpreting regression coefficients from nonparametric regression in R (Rfit)

#1
Hello,
I need to examine the relationship between a continuous outcome variable and a number of predictors. Since my data are non-normally distributed (more precisely, the residuals from the multiple linear regression are not), I decided to use rank-based regression via the rfit() function from the Rfit package for R by Kloke and McKean (companion package npsm: https://cran.r-project.org/web/packages/npsm/npsm.pdf).

I can run the analysis with no problems, but I am struggling with the concept of rank-based regression coefficients and how to interpret them. I have read a few scientific articles that use Rfit (e.g. https://doi.org/10.1007/s10286-012-0158-6) and the original paper by Kloke and McKean (https://journal.r-project.org/archive/2012-2/RJournal_2012-2_Kloke+McKean.pdf), but I am still not clear on some issues.
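For reference, this is roughly the kind of call I am running (outcome, x1, x2 and mydata are placeholders here, not my real variables):

```r
library(Rfit)   # rank-based estimation for linear models (Kloke & McKean)

# Made-up data just to illustrate the call; heavy-tailed errors
set.seed(1)
mydata <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
mydata$outcome <- 2 + 0.5 * mydata$x1 + rt(100, df = 3)

fit <- rfit(outcome ~ x1 + x2, data = mydata)
summary(fit)    # coefficient table with estimates, SEs, t-ratios and p-values
```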

I understand that it is safe to say that a significant negative (or positive) regression coefficient indicates a negative (or positive) association between the predictor and the outcome. But how can I obtain some information on the magnitude of the effect?

I am familiar with unstandardized and standardized beta coefficients in multiple linear regression, and with the fact that standardized coefficients are generally considered more appropriate for comparing the effects of different predictors on the outcome.

Would it be stupid to say that the rank-based regression coefficient is the amount of change in the outcome due to a one-rank change in the predictor? And would this allow me to compare predictors directly, i.e. would a larger coefficient mean a larger effect?

Thank you so much for any help you can give me!

Karabiner

TS Contributor
#2
Since my data is non-normally distributed (i.e. the residuals from the multiple linear regression are not)
Normality of the residuals is commonly considered by far the least important assumption in linear regression.
Moreover, if the sample size is large (n > 30 or n > 50 or so), the tests are robust against deviations from normality
(cf. the central limit theorem).
Since you consider several predictors at the same time, I suppose your sample size is large enough for
a linear regression analysis.
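A quick simulation illustrates the point (a sketch with made-up, clearly skewed errors; the true slope is zero, so the rejection rate should stay near the nominal 5%):

```r
# With n = 300 and skewed errors, the OLS slope test keeps its nominal error rate
set.seed(1)
pvals <- replicate(2000, {
  x <- rnorm(300)
  y <- 1 + (rexp(300) - 1)                 # skewed, mean-zero errors; slope = 0
  summary(lm(y ~ x))$coefficients["x", 4]  # p-value of the slope test
})
mean(pvals < 0.05)                         # close to 0.05 despite non-normality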

With kind regards

Karabiner
#3
Dear Karabiner, thank you very much. As you supposed, the sample size is large (n around 300). However, I am planning to publish these results, and in my experience journals tend to be a bit fussy about assumptions. Tests for non-normality, such as the Shapiro-Wilk test on the residuals, are significant (though I would expect them to be: given the large sample size, even small deviations from normality will come out significant). The Q-Q plots are not really bad, but there is some deviation from a straight line at the ends, and Q-Q plots are often claimed to be too subjective.
I totally agree with you, but I was trying to find an easier way out. I am also genuinely interested in how to interpret nonparametric regression coefficients. Any suggestions?

Karabiner

TS Contributor
#4
I totally agree with you but I was trying to find an easier way out.
You find it easier to use "nonparametric regression", which you do not understand,
instead of referring to the simple fact that with n = 300 the question of normality
of the residuals is irrelevant. The journals you are submitting to must be somewhat
strange. No offense meant.

With kind regards

Karabiner
#5
You find it easier to use "nonparametric regression", which you do not understand,
instead of referring to the simple fact that with n = 300 the question of normality
of the residuals is irrelevant. The journals you are submitting to must be somewhat
strange. No offense meant.

With kind regards

Karabiner
No offense taken, and I do get your point. I really thank you for following up on my question. I was just wondering whether nonparametric regression coefficients could be interpreted in the same way as standardized parametric regression coefficients. I have come across nonparametric regression in a number of papers in my field and I am now curious about it.

hlsmith

Less is more. Stay pure. Stay poor.
#6
I may take a look at this approach later this week, but if I were you I would use LS (least squares) with robust SEs here, and study the rank-based approach to use next time. It wasn't transparent to me what was going on with the estimates. In the paper on the function they just state that you can interpret the coefficients as in LS, but I would have reservations about putting my name on a paper if I didn't quite know what was going on in my analytics. I'll let you know if I surmise anything later this week!
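Something along these lines (a sketch using the sandwich and lmtest packages; the data and variable names are made up for illustration):

```r
library(sandwich)   # robust covariance estimators (vcovHC)
library(lmtest)     # coeftest() for tests with a custom vcov

# Made-up data with skewed errors, standing in for the real dataset
set.seed(1)
mydata <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
mydata$outcome <- 1 + 0.5 * mydata$x1 + (rexp(300) - 1)

fit <- lm(outcome ~ x1 + x2, data = mydata)
coeftest(fit, vcov = vcovHC(fit, type = "HC3"))   # robust SEs and t-tests
```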


Thanks.