We have run a set of control samples to establish the natural range of our measurements, with the aim of identifying compounds that yield results outside this expected range.

The control dataset is fairly limited because the work is time-consuming, so there is uncertainty in the estimated mean and standard deviation. We generate a single value for each compound screened, but the nature of the work means it is reasonable to assume that its variance will be similar to that of the controls.

My approach has been to take the t-statistic for p = 0.05 on the control dataset and multiply it by the sample standard deviation to obtain the possible range of the control mean. To identify potential hits from a single data point at p = 0.1, I have simply doubled this, so anything outside the range [control sample mean] ± 2*t*[sample standard deviation] registers as a 'hit' with 90% certainty. Further work can then be done to investigate these hits.
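
To make the calculation concrete, here is a minimal Python sketch of the rule exactly as I have described it; it is not meant as a validated method. The control values are made up for illustration, the function names `hit_range` and `is_hit` are hypothetical, and the t value is assumed to be the two-sided critical value for p = 0.05 with n - 1 = 5 degrees of freedom, read from a standard t table.

```python
import statistics


def hit_range(controls, t_crit):
    """Return (lower, upper) for the rule: mean +/- 2 * t * sample SD."""
    mean = statistics.mean(controls)
    sd = statistics.stdev(controls)  # sample SD (n - 1 denominator)
    half_width = 2 * t_crit * sd     # the "doubling" step described above
    return mean - half_width, mean + half_width


def is_hit(value, controls, t_crit):
    """Flag a single screening value that falls outside the range."""
    lower, upper = hit_range(controls, t_crit)
    return value < lower or value > upper


# Hypothetical control measurements (illustrative only, not real data).
controls = [10.1, 9.8, 10.4, 9.9, 10.2, 10.0]

# Assumed: two-sided t critical value, p = 0.05, df = len(controls) - 1 = 5.
T_CRIT_DF5 = 2.571

lower, upper = hit_range(controls, T_CRIT_DF5)
print(f"Flag as a hit if outside [{lower:.2f}, {upper:.2f}]")
print("Value 12.0 is a hit:", is_hit(12.0, controls, T_CRIT_DF5))
```

The critical value is passed in as a parameter so that it can be swapped for the right table entry when the number of controls changes.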

Is this a correct, or at least reasonable, approach? It's a long time since I've done any real statistics, so I feel I may be barking up the wrong tree with this. Many thanks in advance.