# Thread: Using statistical methods to find threshold/base line?

I am a newbie to statistics and would like to seek some advice. Are there any statistical methods that can be used to determine a threshold or baseline for a dataset? The threshold/baseline would act as a flag: a data point that goes above or below it would probably be of concern.

Maybe some more details would be useful to know.
What do you mean by dataset here? What was measured, on which scale(s)?
Do you refer to just 1 variable or to several variables? Why do you need
to flag something, and what does it mean for a data point to be of concern?
And how large is your dataset?

There are no scales. The dataset comprises monthly volume data (2 columns: date, volume) over a few years.

I was wondering whether it is possible to apply a statistical method to this dataset to derive a threshold or baseline, so that I can check a particular month's volume against it and determine whether that month's data is considered high or low.

Read up on an Individuals - Moving Range (IMR) control chart. https://en.wikipedia.org/wiki/Shewha..._control_chart
This is an effective tool for separating real changes from random fluctuations.
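
To make the control chart suggestion concrete, here is a minimal sketch of how the limits of an Individuals - Moving Range chart are usually computed. The function name `imr_limits` and the volume figures are made up for illustration and are not from the original poster's data. The constants 2.66 and 3.267 are the standard chart constants for moving ranges of two consecutive points.

```python
from statistics import mean


def imr_limits(values):
    """Return (center, lcl, ucl, mr_ucl) for an Individuals - Moving Range chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    # 2.66 = 3 / d2, where d2 = 1.128 for a moving range of span 2.
    ucl = center + 2.66 * mr_bar
    lcl = center - 2.66 * mr_bar
    # Upper limit for the moving range chart itself (D4 = 3.267 for span 2).
    mr_ucl = 3.267 * mr_bar
    return center, lcl, ucl, mr_ucl


# Hypothetical monthly volumes, oldest first (illustrative only).
volumes = [100, 102, 98, 101, 99, 103, 97, 100, 140, 101, 99, 100]

center, lcl, ucl, mr_ucl = imr_limits(volumes)

# Flag months whose volume falls outside the control limits.
flagged = [(i, v) for i, v in enumerate(volumes) if v < lcl or v > ucl]
print(f"center={center:.1f}, LCL={lcl:.1f}, UCL={ucl:.1f}")
print("flagged (index, volume):", flagged)
```

In practice the limits are normally estimated from a period believed to be stable and then applied to new months, rather than recomputed with the suspicious month included.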

Hi nizze,
When you say baseline, do you need that quantity for forecasting (for example, commercial demand in a market)?
Anyway, to calculate a baseline such as an "average value", you can assume that your data follow a normal distribution (ideally after testing that assumption). Then choose a significance level alpha (0.05 is a common, "realistic" choice) and use it to build an interval around the mean. Strictly speaking this is a range for individual values, not a confidence interval for the mean. For example, for alpha = 0.05 the interval is
[mean - 1.96*std.dev., mean + 1.96*std.dev.].

https://en.m.wikipedia.org/wiki/File:NormalDist1.96.png

If the normality assumption holds, a new value has about a 95% probability of falling inside that interval.

For your purpose, you can use the interval to identify "outlier" values (those that fall outside it).
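
As a rough sketch of the interval approach described above, the snippet below computes mean ± 1.96 standard deviations and flags months outside that band. The helper name `normal_band` and the sample volumes are hypothetical, and the code simply assumes approximate normality rather than testing for it.

```python
from statistics import mean, stdev


def normal_band(values, z=1.96):
    """Return (lower, upper) = mean -/+ z * sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return m - z * s, m + z * s


# Hypothetical monthly volumes, oldest first (illustrative only).
volumes = [100, 102, 98, 101, 99, 103, 97, 100, 140, 101, 99, 100]

lower, upper = normal_band(volumes)

# Months outside the band are candidates for "high" or "low".
high = [(i, v) for i, v in enumerate(volumes) if v > upper]
low = [(i, v) for i, v in enumerate(volumes) if v < lower]
print(f"band: [{lower:.1f}, {upper:.1f}]")
print("high:", high, "low:", low)
```

Note that an extreme month inflates the standard deviation and so widens the band it is being tested against. Monthly data with a trend or seasonality would also stretch the normality assumption.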

Hope this helps (if you still need it).

