
Thread: best way to extract selected variables from a large dataset?

  1. #1

    best way to extract selected variables from a large dataset?



    Hi All,

    I was wondering what would be the best way to extract selected variables from a large dataset (DHS data). I will be working with 40-50 datasets, from which I will need to extract 10-20 variables and merge those into a new dataset.

    Kindly advise how to avoid copy-paste.

    Thank you,
    S.

  2. #2

    Re: best way to extract selected variables from a large dataset?

    You could define a local macro for the variable names and use -keep- and -save-.
    Then you could use -merge- or -append- (depending on the nature of your data) to combine them into a new dataset. If the observations in the 40-50 datasets line up row for row, you could use -merge 1:1 _n-.
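
    A minimal sketch of the macro, -keep- and -save- steps for a single file. The variable names (caseid, v001, v012, v025) and file names are hypothetical placeholders, not taken from the original post. Run it from a do-file in one go, because a local macro disappears once the block of commands that defined it has finished.

    ```stata
    * Hypothetical variable list; replace with the 10-20 variables you need.
    local keepvars caseid v001 v012 v025

    * Load one of the large datasets (file name is a placeholder).
    use "survey1.dta", clear

    * Drop everything except the variables named in the macro.
    keep `keepvars'

    * Save the reduced copy under a new name so the original is untouched.
    save "survey1_small.dta", replace
    ```

    Repeating this for each file gives a set of small datasets that can then be combined with -append- (stacking observations) or -merge- (adding variables).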



  3. The Following User Says Thank You to wangwang For This Useful Post:

    SylviaS (09-07-2012)

  4. #3

    Re: best way to extract selected variables from a large dataset?

    To save a bit of time, in addition to what wangwang has suggested: if you are bringing the data in from another source, such as a file exported from Excel, you can read it with -insheet- and then -keep- the variables you want. If it's always the same variable names, you can loop this along with the merge. If the data are already in Stata format you won't need to do that, but I couldn't tell from your post which was the case.

    On a kind of random note, can -merge- be used directly with non-Stata files? I've never thought about trying it, so maybe I am being really inefficient!

  5. The Following User Says Thank You to duskstar For This Useful Post:

    SylviaS (09-07-2012)

  6. #4

    Re: best way to extract selected variables from a large dataset?

    Thank you both. I am not so advanced as to create a local macro.

    The datasets are all in Stata format already, and normally all variables should be standardised (I still need to double-check). I was wondering whether there is any Stata command that would let me extract variables into a new dataset.

    Any time-efficient and simple solutions?

  7. #5

    Re: best way to extract selected variables from a large dataset?

    When you say extract, what do you mean? Do you mean export to some other program?

    Like wangwang said before, you use -keep- to specify which variables you want (if you want to keep it in Stata format) and then -save-. If you want to take it into another program you can use something like -outsheet-. Do you want to put everything into one file? In that case you can use -merge- or -append-. If the problem is writing the loop we can probably provide examples, once we know a little more.
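
    As a rough illustration of such a loop, here is a sketch that assumes every file is already in Stata format, sits in the current working directory, and contains the same variable names. The file names (survey1, survey2, survey3) and variable names are hypothetical placeholders. It relies on -use varlist using filename-, which loads only the listed variables, so the full dataset never has to be read into memory.

    ```stata
    * Hypothetical names; replace with your own variables and files.
    local keepvars caseid v001 v012 v025
    local files survey1 survey2 survey3

    * Start with an empty temporary dataset to accumulate the results.
    clear
    tempfile combined
    save `combined', emptyok

    foreach f of local files {
        * Load only the selected variables from this file.
        use `keepvars' using "`f'.dta", clear

        * Record which file each observation came from.
        generate str32 source = "`f'"

        * Stack what has been collected so far underneath, then store it again.
        append using `combined'
        save `combined', replace
    }

    * The data in memory now hold the selected variables from all files.
    save "combined_small.dta", replace
    ```

    This stacks observations with -append-. If the files instead hold different variables for the same observations, the -append- line would need to become a -merge- on a suitable identifier.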

    I'm sorry, I don't think I have understood your question, so I'm not sure this is helpful.

  8. #6
    bukharin

    Re: best way to extract selected variables from a large dataset?


    I agree with duskstar. It is very unclear what you're actually trying to do. I suggest providing a "before & after" example.
