Dropping the end percentages of outlying observations


New Member
I know there is some debate about whether to drop outlying observations, but I have some extreme, nonsensical outliers that I have been advised to remove, and I cannot work out how.

So I want to remove the top 0.5% of observations from my WAGE variable in the process of cleaning my data. I feel like I have tried everything and trawled the net, and I just can't find the Stata code to do this correctly.

Any advice much appreciated :)
Thanks, Sarah

First, generate a new variable containing the 99.5th percentile of WAGE, which is the cutoff for the top 0.5% (the variable name p95 is arbitrary):

egen p95 = pctile(WAGE), p(99.5)

and then drop those above the percentile:

drop if WAGE > p95
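
A possible variant, offered as a sketch rather than a definitive recipe: the top 0.5% of WAGE lies above the 99.5th percentile, and _pctile can return that cutoff without adding a variable to the dataset. The scalar name wage_cut is just an illustrative choice, and the !missing(WAGE) condition assumes you want to keep observations where WAGE is missing, since Stata treats missing values as larger than any number.

```stata
* Compute the 99.5th percentile of WAGE; the result is left in r(r1).
_pctile WAGE, p(99.5)

* Save it right away, because later r-class commands overwrite r().
* wage_cut is a hypothetical name chosen for this example.
scalar wage_cut = r(r1)

* Check how many observations would be removed before dropping anything.
count if WAGE > scalar(wage_cut) & !missing(WAGE)

* Drop the top 0.5%, keeping observations with missing WAGE.
drop if WAGE > scalar(wage_cut) & !missing(WAGE)
```

If observations with missing WAGE should go as well, leave out the !missing(WAGE) condition.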




New Member
Thanks so much for your prompt response Etienne!
I have actually tried this already (I found it in my lengthy internet searches), but it won't work. I get an error:
'p95 not found'
I feel like I have tried everything already, any ideas?
Thanks again, Sarah


New Member
Oh no, sorry, please ignore my last reply. I had something funny going on earlier in my code from sensitivity testing. It worked perfectly!
Thanks so much for your help, much appreciated