data manipulation stata

#1
data dropping stata 2

Hi all,
I have the following balanced panel of firms:

FirmID Year Respondvariable
1 0 30
1 1 40
1 2 30
1 3 30
1 4 30
2 0 40
2 1 463
2 2 .
2 3 .
2 4 .
3 0 30
3 1 463
3 2 .
3 3 .
3 4 40

In year 0 every firm has a value of either 40 or 30 for Respondvariable. In the other years it can take other values: 30, 40, 200, 210, 220, 330, 401, 431, 450, 463, 465 or 590. I want to keep only those firms that:
1) responded 40 or 30 in all 4 subsequent years (years 1-4), or
2) responded 40 or 30 in year 1 (or in years 1 and 2, or in years 1, 2 and 3), then 463 in the following year, and have a . in every year after that (that is, 463 in year 2 with a . in years 3 and 4, or 463 in year 3 with a . in year 4, or 463 in year 4).
So in the sample above I would like firm 3 to be dropped.
As I am relatively new to Stata, I do not know how to do this. Does anybody have any suggestions?

Thanks a lot in advance,

Carmen
 

bukharin

RoboStataRaptor
#2
One easy way would be to -reshape wide-, drop the relevant firms, then -reshape long- to go back to the original long format. Alternatively you could use -by- with explicit subscripting since you know it's a balanced panel.
Code:
* 1st approach: reshape to wide, flag the firms to keep, reshape back to long
reshape wide Respondvariable, i(FirmID) j(Year)
* start at 0: a missing tokeep would count as true in -keep if tokeep-
gen byte tokeep = 0
* rule 1: 30 or 40 in each of years 1-4
replace tokeep = 1 if inlist(Respondvariable1, 30, 40) & ///
   inlist(Respondvariable2, 30, 40) & ///
   inlist(Respondvariable3, 30, 40) & ///
   inlist(Respondvariable4, 30, 40)
* add one line like this per remaining rule:
* replace tokeep = 1 if (your other rules)
keep if tokeep
drop tokeep
reshape long Respondvariable, i(FirmID) j(Year)

* 2nd approach: stay in long format; within each firm, [2] is year 1, ..., [5] is year 4
gen byte tokeep = 0
sort FirmID Year
by FirmID: replace tokeep = 1 if inlist(Respondvariable[2], 30, 40) & ///
   inlist(Respondvariable[3], 30, 40) & ///
   inlist(Respondvariable[4], 30, 40) & ///
   inlist(Respondvariable[5], 30, 40)
* (etc. for the other rules, then -keep if tokeep- and -drop tokeep-)
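
Both rules can also be checked in one pass in long format, without writing out every year combination. The following is only a sketch under one assumption that is not spelled out in the question: the 463 may occur in any of years 1-4 (including year 1, as for firm 2 in the sample), and every year after it must be missing. The helper variables n463, seen463, ok and tokeep are names made up for this illustration, and the -input- block simply re-enters the sample data so the example runs on its own.

```stata
clear
input FirmID Year Respondvariable
1 0 30
1 1 40
1 2 30
1 3 30
1 4 30
2 0 40
2 1 463
2 2 .
2 3 .
2 4 .
3 0 30
3 1 463
3 2 .
3 3 .
3 4 40
end

sort FirmID Year
* running count of 463s within each firm, including the current year
by FirmID: gen n463 = sum(Respondvariable == 463)
* 1 if a 463 occurred in an earlier year of the same firm
gen byte seen463 = (n463 - (Respondvariable == 463)) > 0

* a firm-year is acceptable if it is year 0, or
*   no 463 so far and the value is 30, 40 or 463 (assumption: 463 allowed in any year), or
*   a 463 occurred earlier and the value is missing
gen byte ok = (Year == 0) ///
    | (!seen463 & inlist(Respondvariable, 30, 40, 463)) ///
    | ( seen463 & missing(Respondvariable))

* keep a firm only if every one of its years is acceptable
by FirmID: egen byte tokeep = min(ok)
keep if tokeep == 1
drop n463 seen463 ok tokeep
```

On the sample data this keeps firms 1 and 2 and drops firm 3, because firm 3 reports 40 in year 4 after its 463. If the 463 must not occur in year 1, the condition on ok would need an extra restriction on Year.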
 