Thread: Deleting data - Panel Data (please help)

  1. #1

    Deleting data - Panel Data (please help)




    Hi,

    I have panel data (i.e., a number of different firms observed over time).

    I want to say something like 'if there are any missing values for variable x, then delete the whole firm', as opposed to just deleting that particular firm-year observation.

    Help?

  2. #2
    Lazar

    Re: Deleting data - Panel Data (please help)

    I don't know the answer to this question, BUT please do not do this! Dropping whole firms because of missing values can introduce a lot of bias. Please look into modern missing data techniques (e.g., multiple imputation).
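
    For readers who have not used it, here is a minimal sketch of what multiple imputation might look like with Stata's mi commands. Everything in it is an assumption made for illustration: the simulated toy panel, the variable names y and z, the choice of a linear imputation model, and the number of imputations. A real panel would likely need an imputation model that respects the firm structure.

```stata
* Toy panel (hypothetical): 20 firms observed over 5 years
clear
set seed 12345
set obs 100
generate company = ceil(_n / 5)
bysort company: generate year = 2000 + _n
generate z = rnormal()
generate x = 0.5*z + rnormal()
generate y = 1 + 2*x + z + rnormal()

* Knock out roughly 15% of x so there is something to impute
replace x = . if runiform() < 0.15

* Declare the mi data structure and mark x as the variable to impute
mi set mlong
mi register imputed x

* Impute x from y and z with a linear regression model, 5 imputations
mi impute regress x y z, add(5) rseed(12345)

* Fit the analysis model on each imputed dataset and pool the results
mi estimate: regress y x z
```

    The point of the sketch is only that no firm is discarded: the incomplete firm-years are filled in several times and the estimates are combined across the imputed datasets.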

  3. #3
    bukharin

    Re: Deleting data - Panel Data (please help)

    I agree with Lazar's advice, but this is how you would do it. The trick is that numeric missing values in Stata are treated as higher than any nonmissing number, so if you sort by variable x, the missing values will come last.

    Therefore you can do it in one line:
    Code: 
    bysort company (x): drop if missing(x[_N])
    Under the by prefix, x[_N] is the last value of x within each company, so it is missing exactly when that company has at least one missing x.
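
    To see the one-liner at work, here is a small sketch on a made-up panel. The three firms, the year variable, and the values of x are hypothetical; only the bysort line comes from the post above.

```stata
* Toy panel (hypothetical): three firms, three years each
clear
input company year x
1 2001 10
1 2002 .
1 2003 12
2 2001 5
2 2002 6
2 2003 7
3 2001 .
3 2002 .
3 2003 9
end

* Within each company, sort by x so any missing value lands last,
* then drop every observation of the company if that last value is missing
bysort company (x): drop if missing(x[_N])

* Restore a readable order and inspect what is left
sort company year
list, sepby(company)

* Only firm 2, which has no missing x, should survive
assert company == 2
assert _N == 3
```

    Note that the trick relies on x being numeric: for a string variable the missing value is the empty string, which sorts first rather than last. An alternative that does not depend on sort order is to count the missing values per firm, e.g. egen nmiss = total(missing(x)), by(company) followed by drop if nmiss > 0.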

  4. #4

    Re: Deleting data - Panel Data (please help)


    Thanks for the reply. Great stuff.



