# Outlier detection via Mahalanobis distance followed by Rosner's test (Generalized ESD test)

#### amir.banaei

##### New Member
Hello all,

I have a question regarding the method I recently used to analyze my data. For biological reasons, we are interested in the outliers themselves (in contrast to most situations, where the goal is to get rid of them).

Let's imagine we have the concentrations of two proteins under several conditions (this can be visualized as a 2D scatter plot where each axis shows the concentration of one protein and each dot depicts a condition). I compute the Mahalanobis distance of each condition and then use Rosner's test to identify outliers at alpha = 0.05. It seems that Rosner's test requires an "approximately normal distribution" (https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm), but for bivariate normal data the Mahalanobis distances follow a Rayleigh distribution (a chi distribution with 2 degrees of freedom). I am wondering whether a Rayleigh distribution can be considered "approximately normal". Is this combination (Mahalanobis distance followed by Rosner's test) statistically sound?

Your help would be highly appreciated.

#### katxt

##### Member
OK. You could check the distances for normality. My impression is that a chi distribution with 2 df isn't close enough to normal to use a normal-based test.
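As a rough check on that impression: the skewness and excess kurtosis of a Rayleigh distribution have closed forms that do not depend on its scale, so they can be compared directly with the normal distribution's values of 0 and 0. A minimal Python sketch using only the standard library (the function name `rayleigh_shape` is made up for illustration):

```python
import math

def rayleigh_shape():
    """Return (skewness, excess kurtosis) of a Rayleigh distribution.

    Both are scale-free constants; a normal distribution has 0 for each.
    """
    skew = 2 * math.sqrt(math.pi) * (math.pi - 3) / (4 - math.pi) ** 1.5
    excess_kurt = -(6 * math.pi ** 2 - 24 * math.pi + 16) / (4 - math.pi) ** 2
    return skew, excess_kurt

skew, kurt = rayleigh_shape()
print(f"skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}")
# skewness = 0.631, excess kurtosis = 0.245
```

A skewness of about 0.63 means a noticeably longer right tail than a normal distribution has, and the right tail is exactly the region an upper-tail outlier test examines.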
One thought is to rank all the distances and flag anything in, say, the upper 0.001 tail. That is =CHIINV(0.001,2) in Excel, or about 13.8, on the squared-distance scale (about 3.7 for the distance itself). You would need to choose your own limit, which would depend on the number of data points.
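For anyone working outside Excel, here is a minimal Python sketch of that cutoff idea using only the standard library. It assumes the data are a list of (x, y) pairs, that the mean and covariance are estimated from the same sample, and that the chi-squared distribution with 2 df is an adequate reference for the squared distances. All function names are hypothetical, chosen for illustration.

```python
import math

def chi2_2df_upper_quantile(tail_prob):
    """Upper-tail quantile of chi-squared with 2 df: P(X > q) = exp(-q / 2)."""
    return -2.0 * math.log(tail_prob)

def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean,
    using the sample covariance matrix (n - 1 denominator)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    d2 = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form with the inverse of [[sxx, sxy], [sxy, syy]]
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2

def flag_upper_tail(points, tail_prob=0.001):
    """Indices of points whose squared distance exceeds the chi-squared cutoff."""
    cutoff = chi2_2df_upper_quantile(tail_prob)
    return [i for i, d2 in enumerate(squared_mahalanobis_2d(points)) if d2 > cutoff]

print(round(chi2_2df_upper_quantile(0.001), 1))  # 13.8
```

Because the mean and covariance come from the same data, the chi-squared reference is only approximate. With few points a cutoff like 13.8 may be unreachable altogether, which is one reason the limit has to depend on the number of data points.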
Another idea is to make a Q-Q plot of the distances against the chi distribution with 2 df and see whether any points in the upper tail are well off the straight line. kat
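A minimal Python sketch of the Q-Q idea, working on the squared-distance scale (squared distances against chi-squared with 2 df, which is equivalent to distances against chi with 2 df). The plotting positions (i - 0.5) / n are one common convention and are an assumption here, as is the function name.

```python
import math

def chi2_2df_qq_pairs(squared_distances):
    """Return (theoretical, observed) quantile pairs for a Q-Q plot of
    squared Mahalanobis distances against chi-squared with 2 df."""
    ordered = sorted(squared_distances)
    n = len(ordered)
    pairs = []
    for i, observed in enumerate(ordered, start=1):
        p = (i - 0.5) / n                        # plotting position in (0, 1)
        theoretical = -2.0 * math.log(1.0 - p)   # chi-squared 2 df quantile
        pairs.append((theoretical, observed))
    return pairs
```

The pairs can be plotted with any tool. Points close to the line y = x are consistent with the reference distribution, while upper-tail points sitting well above the line are the candidates worth a closer look.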