Thoughts? Thank you!

- Thread starter tess2
- Start date
- Tags categorical percentiles regression

I'd like to convert age and income into categorical variables: low, medium, and high.

This would create artificial groups, which could be meaningless. Usually, the interval-scaled variable works perfectly well as a predictor.

I'd like to do this using percentiles, but am not sure if I should use tertiles (lower 33% = low, middle 33% = medium, upper 33% = high) or if I should divide the data into lower 25% (low), middle 50% (medium), and upper 25% (high).
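
To make the two schemes in the question concrete, here is a minimal sketch using only Python's standard library. The `ages` list is made-up illustration data, and `cut_points` and `label` are hypothetical helper names, not anything from the thread. It only shows how the cut points and group sizes differ between the schemes; it does not settle whether binning is a good idea.

```python
import statistics

def cut_points(values, scheme):
    """Return (low_cut, high_cut) for the chosen binning scheme."""
    if scheme == "tertiles":
        low_cut, high_cut = statistics.quantiles(values, n=3)
    elif scheme == "25-50-25":
        # Quartiles: keep the first and third, ignore the median.
        low_cut, _median, high_cut = statistics.quantiles(values, n=4)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return low_cut, high_cut

def label(value, low_cut, high_cut):
    """Assign low / medium / high relative to the two cut points."""
    if value <= low_cut:
        return "low"
    if value <= high_cut:
        return "medium"
    return "high"

# Hypothetical data: 60 distinct ages, 20 through 79.
ages = list(range(20, 80))

group_sizes = {}
for scheme in ("tertiles", "25-50-25"):
    cuts = cut_points(ages, scheme)
    labels = [label(a, *cuts) for a in ages]
    group_sizes[scheme] = {k: labels.count(k) for k in ("low", "medium", "high")}
    print(scheme, cuts, group_sizes[scheme])
```

Either way, the cut points come from the data at hand, so the same age can fall into different groups depending on the scheme.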

Your definition will then be sample-specific. To what could the results be generalized? The next study with the next sample, or the population, will have other limits for the 33% groups, etc.

Moreover, if most participants are poor, or most of them have medium income, or most of them are wealthy, you'll define people as low/medium/high who aren't.

With kind regards

K.
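
The point about sample-specific cut points can be illustrated with a small simulation. This is only a sketch under stated assumptions: the incomes are drawn from two made-up log-normal distributions (the parameters are arbitrary), and `tertile_cutoffs` and `categorize` are hypothetical helper names.

```python
import random
import statistics

def tertile_cutoffs(values):
    """Return the two cut points that split the sample into three equal parts."""
    low_cut, high_cut = statistics.quantiles(values, n=3)
    return low_cut, high_cut

def categorize(value, low_cut, high_cut):
    """Label a value relative to one sample's own tertile cut points."""
    if value <= low_cut:
        return "low"
    if value <= high_cut:
        return "medium"
    return "high"

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumption: two samples from populations with different typical incomes.
sample_a = [rng.lognormvariate(10.0, 0.5) for _ in range(1000)]
sample_b = [rng.lognormvariate(10.8, 0.5) for _ in range(1000)]

cuts_a = tertile_cutoffs(sample_a)
cuts_b = tertile_cutoffs(sample_b)

# The same person, judged against each sample's tertiles.
income = 35000
label_a = categorize(income, *cuts_a)
label_b = categorize(income, *cuts_b)
print("sample A cut points:", cuts_a, "->", label_a)
print("sample B cut points:", cuts_b, "->", label_b)
```

With these assumed parameters, the same income is labelled differently in the two samples, which is why categories defined this way are hard to carry over to another study.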

The only exception I can think of is when you are not disseminating results, the analysis is purely for in-house use, and your sample is pretty complete. But if you are looking to share your results, it can be difficult to generalize them to other samples or populations if the cut rules were developed using only your own sample.

Also, if there wasn’t a linear relationship between the variables and it was desirable to convert the continuous variable into categories, would you use tertiles or 25-50-25?

Thanks!

Plotting the relationship between the variables is important for understanding linearity. Options include scatterplots, loess curves, and generalized additive models (splines). Tertiles may be dangerous to use in these situations; you want to first determine where the changes in slope occur (knots), and sometimes simple piecewise regression or data transformations (logs or polynomials) are good choices. But this assumes the relationship isn't monotonic and that not accounting for the non-monotonicity would be inappropriate. If the relationship is linear, even one that changes slowly, that is fine, but if you have something sinuous or, say, quadratic going on, you need to address it.
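
As a rough illustration of looking for a knot instead of imposing tertiles, the sketch below does a brute-force search for a single breakpoint. It is a simplification, not a full piecewise-regression method: it fits two separate straight lines (not forced to join at the knot), the data are synthetic and noise-free, and `ols_line` and `best_single_knot` are hypothetical helper names.

```python
def ols_line(xs, ys):
    """Closed-form simple least squares; returns (intercept, slope, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def best_single_knot(xs, ys, min_points=3):
    """Try each split of the sorted data, fit a line on each side,
    and return (first x of the right-hand segment, total SSE) for the best split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:i], pairs[i:]
        sse_left = ols_line(*zip(*left))[2]
        sse_right = ols_line(*zip(*right))[2]
        total = sse_left + sse_right
        if best is None or total < best[1]:
            best = (pairs[i][0], total)
    return best

# Synthetic data: slope +1 up to x = 5.25, then slope -2 afterwards.
xs = [i * 0.5 for i in range(21)]  # 0.0, 0.5, ..., 10.0
ys = [x if x < 5.25 else 5.25 - 2 * (x - 5.25) for x in xs]

single_sse = ols_line(xs, ys)[2]
knot, split_sse = best_single_knot(xs, ys)
print("single-line SSE:", single_sse)
print("best split starts at x =", knot, "with SSE:", split_sse)
```

Here the search places the split where the slope changes, which falls between the tertile boundaries of x. With real, noisy data you would still want to look at the plot first and check the result against a loess or spline fit.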