Hello everyone,

New to this forum and kind of wondering why I hadn't thought to search for one before tonight... I hope this is the light at the end of the 'searching the literature' tunnel.

Briefly, I use PLS regression to model spectroscopic measurements of plant material. That's probably neither here nor there.

My question is: How many samples should be in a calibration set? I can't seem to get a consistent answer... it ranges from "2 more samples than you have variables" to "you just need to cover the range you're trying to predict" to "just see what works for you"... etc. Is there a rule buried deep in the mathematics of PLS that is beyond me, or does anyone else ever feel they're at the mercy of how a reviewer is feeling that day?

I'm not a statistician; I'm still learning, trying to grasp the language and understand the rules (if there are any rules). Any input or help with this would be greatly appreciated.

Kind regards.