Let's say I am estimating a linear regression with Y as a continuous DV and X as a binary IV. For theoretical reasons, I expect X to have a positive effect on Y in the first quartile of the DV and a negative effect in the other quartiles. If I split the sample, this assumption is confirmed.

How can I model this relationship without splitting the sample? Quantile regression produced different results from the split-sample analysis. Are there any other options?

Thank you for your help.
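
One possible reason the two approaches disagree, offered as a sketch rather than a diagnosis: splitting the sample by quartiles of Y selects observations on the outcome itself, whereas quantile regression with a single binary X and an intercept compares the same quantile of Y between the X = 0 and X = 1 groups (up to how the sample quantile is defined). The Python snippet below illustrates that second comparison using only the standard library. The data and the function name `group_quantile_gap` are made up for illustration, and the "inclusive" quantile method is an arbitrary choice.

```python
from statistics import quantiles


def group_quantile_gap(y, x, n=4):
    """Return Q(Y | X=1) - Q(Y | X=0) at each of the n-1 cut points.

    With a binary X this is the quantity a quantile regression of Y on X
    targets at each quantile level (hypothetical helper, not a library API).
    """
    y1 = [yi for yi, xi in zip(y, x) if xi == 1]
    y0 = [yi for yi, xi in zip(y, x) if xi == 0]
    q1 = quantiles(y1, n=n, method="inclusive")
    q0 = quantiles(y0, n=n, method="inclusive")
    return [a - b for a, b in zip(q1, q0)]


# Made-up toy data: the X=1 group is more compressed than the X=0 group.
y_control = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_treated = [2, 3, 4, 4, 5, 5, 6, 6, 7]
y = y_control + y_treated
x = [0] * len(y_control) + [1] * len(y_treated)

gaps = group_quantile_gap(y, x)
print(gaps)
```

On this toy data the gap is positive at the first quartile, zero at the median, and negative at the third quartile. That is a statement about how X shifts the distribution of Y, which is not the same thing as the slope of Y on X within each quartile of Y.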

I've completed three negative binomial regression tests. One test had 14 independent variables, and the other two had 7 independent variables each, so in total I've looked at 28 IVs across the three tests.

In terms of risk of Type I errors, is this not really an issue as I've only run three

Any advice would be great!

Thanks
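
How much the Type I error risk grows depends on what is counted as a test: the three models, or the 28 coefficient tests inside them. As a rough sketch, the Python snippet below computes the family-wise error rate for both counts. It assumes the tests are independent and uses a per-test alpha of 0.05; neither assumption comes from the post, and the function names are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# Compare counting the 3 models with counting the 28 coefficient tests.
for n in (3, 28):
    print(n, round(familywise_error_rate(n), 3), round(bonferroni_alpha(n), 5))
```

Under those assumptions the rate is about 0.14 for 3 tests and about 0.76 for 28, so the answer changes a lot depending on which count applies.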

I'm new to this forum and kind of wondering why I hadn't thought to search for one before tonight... I'm hoping for a light at the end of the 'searching literature' tunnel.

Briefly, I use PLS regression to model spectroscopic measurements of plant material. That's probably neither here nor there.

My question is: How many samples should be in a calibration set? I can't seem to get a consistent answer... it ranges from 2 more samples than you have variables; you just need to cover the range you're trying to predict; just see what works for you... etc. Is there a rule buried deep in the mathematics of PLS that is beyond me, or does anyone else ever feel they're at the mercy of how a reviewer is feeling that day?

I'm not a statistician. I'm still learning, trying to grasp the language and understand the rules (or possibly the lack of any rules). Any input or help with this would be greatly appreciated.

Kind regards.