Can I not use linear regression if the x values are "stages" that I assigned myself?


New Member
Can someone help me with this problem? I did both a linear regression and an ANOVA but was told that I cannot do either. Do I need to change to non-parametric tests? Let me explain my data set.

I have two data sets: 1. developmental stages (1, 2, 3, etc.) based on the body characteristics (morphology) of larvae of an insect species, and 2. the number of days after hatch.

I want to see whether developmental stage, as judged from body characteristics, can predict (or is associated with) the number of days after hatch. The "number of days after hatch" data are normally distributed with homogeneous variance, so I used "stages" as the x values and "the number of days" as the y values and ran a regression analysis. However, I assigned the developmental stages myself based on the body characteristics of each larva, which is of course not entirely objective. (All stages are distinct enough to be separated from each other, though.) Someone told me that because the "stages" were set subjectively, I cannot use any parametric tests and should choose a non-parametric correlation test rather than regression. Is this true?

I also wanted to see whether the number of days after hatch differs among stages, so I ran an ANOVA. However, for the same reason as above, I was told to change from ANOVA to a non-parametric test.

Please help if you have any ideas about this, as I don't have anyone around to ask. I would really appreciate your help!

Many thanks.

If the stages are distinct, and you used your knowledge of the stages to assign a number to each insect, then I am not sure why this would be subjective. Even if it were subjective, I am not sure why a non-parametric test would be in order. If the dataset is very small, or the data are not normally distributed, then a non-parametric test would make sense, but in this case I would stick with ordinary linear regression. For your X variable, I would split the stages into separate 0/1 (dummy) variables and use all but one of them in the model, leaving one out as the reference. You don't want to keep the developmental stage variable (1, 2, 3, etc.) in the model as it is, because it is not a continuous variable but would be treated as one by the regression if you leave it intact. Ordinal variables such as yours should be split into separate categories. If you are using SPSS, you can ask the regression to treat the variable as categorical, but I have found that this sometimes doesn't work out correctly.
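
As a rough illustration of the dummy coding described above, here is a minimal sketch in plain Python. The data values and the names `larvae`, `dummy_code` and `solve_least_squares` are made up for the example, and the small hand-written solver only stands in for whatever regression routine your statistics package provides.

```python
# Hypothetical data: (developmental stage, days after hatch) per larva.
larvae = [
    (1, 2.0), (1, 3.0), (1, 2.5),
    (2, 5.0), (2, 6.0), (2, 5.5),
    (3, 9.0), (3, 10.0), (3, 9.5),
]


def dummy_code(stages, reference):
    """Build design rows: an intercept plus one 0/1 column per non-reference stage."""
    levels = sorted(set(stages) - {reference})
    rows = [[1.0] + [1.0 if s == lvl else 0.0 for lvl in levels] for s in stages]
    return levels, rows


def solve_least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


stages = [s for s, _ in larvae]
days = [d for _, d in larvae]

# Stage 1 is left out of the model and serves as the reference category.
levels, X = dummy_code(stages, reference=1)
coefs = solve_least_squares(X, days)

print(f"intercept (mean days at stage 1): {coefs[0]:.2f}")
for lvl, b in zip(levels, coefs[1:]):
    print(f"stage {lvl} vs stage 1: {b:+.2f} days")
```

With this coding, the intercept is the mean number of days for the reference stage, and each remaining coefficient is the difference in mean days between that stage and the reference stage.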

Jenny Kotlerman


New Member
Dear Jenny,

Thank you very much for your advice! I understand what you mean.
You helped me a lot. Thank you again.