How many standard deviations to determine outliers

hlsmith

Not a robit
#5
Yes, I agree there is no standard definition, and I will point out that, purely because of how data are distributed, some values will fall, say, two or three SDs out. I would hold off on saying 3 SDs is fairly common, since I only know my own field and because I think fields vary in what they consider extreme; some may want a threshold as rare as one in a million or one in a billion.
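
To put rough numbers on the point that some values will land two or three SDs out by chance alone, here is a minimal sketch. It assumes the data are exactly normal, which real data rarely are, and the function names `two_sided_tail` and `expected_flagged` are made up for illustration.

```python
import math

def two_sided_tail(k):
    """Probability that a standard normal value lies more than k SDs from the mean."""
    # P(|Z| > k) = erfc(k / sqrt(2)) for a standard normal Z
    return math.erfc(k / math.sqrt(2))

def expected_flagged(n, k):
    """Expected number of observations beyond k SDs in a normal sample of size n."""
    return n * two_sided_tail(k)

if __name__ == "__main__":
    n = 10_000  # hypothetical sample size
    for k in (2, 3):
        print(f"k={k}: tail probability {two_sided_tail(k):.4f}, "
              f"expected count in n={n}: {expected_flagged(n, k):.1f}")
```

Under that normality assumption, roughly 4.6% of values fall beyond 2 SDs and roughly 0.27% beyond 3 SDs, so a large enough sample will contain such values even when nothing is wrong with any observation.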
 

Karabiner

TS Contributor
#6
In practice, outliers are defined as those values in a dataset whose removal can
turn an undesired test result ("not significant") into a desired one ("significant").

SCNR
 

hlsmith

Not a robit
#7
Agreed. If you are just looking for a value at, say, the 99th percentile, well, you are going to find one. On the other hand, looking for a value that comes from a different data-generating process, or that is erroneous, is a different task.

@Karabiner I was kind of thinking of the opposite scenario, where removing an observation that is faulty or derived from a different DGP makes two groups more comparable, i.e. no longer different. But as alluded to, there are many definitions and rationales for sleuthing them out.