Dropping the end percentages of outlying observations

sif

New Member
#1
I know there is some debate as to whether to drop outlying observations, but I have some extreme, nonsensical outliers that I have been advised to remove, and I cannot work out how.

So I want to remove the top 0.5% of observations from my WAGE variable in the process of cleaning my data. I feel like I have tried everything and trawled the net, and I just can't find the Stata code to do this correctly.

Any advice much appreciated :)
Thanks, Sarah
 
#2
Hi,

First, generate a new variable containing the 95th percentile of WAGE:

egen p95 = pctile(WAGE), p(95)

and then drop the observations above that percentile:

drop if WAGE > p95

Note that p(95) trims the top 5%; to trim only the top 0.5%, use p(99.5) instead.
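For the top 0.5% asked about in the first post, a minimal sketch might look like the following. It assumes the variable is named WAGE as in the thread; wage_p995 is a made-up name for the cutoff variable. The extra !missing(WAGE) condition is there because Stata treats missing values as larger than any number, so a bare comparison against the cutoff would also drop observations where WAGE is missing.

```stata
* Sketch only: trim the top 0.5% of WAGE.
* wage_p995 is an illustrative name, not something from the thread.
egen wage_p995 = pctile(WAGE), p(99.5)

* See how many observations would be removed before dropping anything.
count if WAGE > wage_p995 & !missing(WAGE)

* Drop values above the 99.5th percentile, keeping missing WAGE untouched.
drop if WAGE > wage_p995 & !missing(WAGE)

* The helper variable is no longer needed.
drop wage_p995
```

Running the count line first is a cheap check that the number dropped is roughly 0.5% of the non-missing observations.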

Best,

Etienne
 

sif

New Member
#3
Thanks so much for your prompt response Etienne!
I have actually tried this already (I found it in my lengthy internet searches), but it won't work. I get an error:
'p95 not found'
I feel like I have tried everything already, any ideas?
Thanks again, Sarah
 

sif

New Member
#4
Oh no, sorry, please ignore my last reply. I had something funny going on earlier in my code from sensitivity testing. It worked perfectly!
Thanks so much for your help, much appreciated.