
Thread: Grouping & categorical question

  1. #1




    I am trying to write some code to do the following, but am unable to figure it out.



    I have a list of sorted keys and a categorical variable (Y/N), and I am trying to list the keys where the categorical variable goes from N to Y (only that way, not the other way around).



    An example is this:

    Key Variable
    1 Y
    1 N
    1 Y
    2 N
    2 Y
    3 Y
    3 Y
    3 N



    So for this dataset, I would like an output of keys 1 and 2, since those are the only ones that move from N to Y within their key.



    Any help is appreciated...thanks!

  2. #2

    /* I think this'll work, maybe there's something better though... */

    *Let's say you have this data set BASE;
    data base;
    input key variable $;
    datalines;
    1 Y
    1 N
    1 Y
    2 N
    2 Y
    3 Y
    3 Y
    3 N
    ;
    run;

    *Preserve order of data;
    data base_order;
    set base;
    order=_N_;
    run;

    *The idea: keep a KEY if its last record with VARIABLE="Y" has a higher ORDER than its first record with VARIABLE="N";
    *eg YYY...NNN... -> no keep;
    *   YYY...NNN...Y -> keep, no matter how many Ns are between the first N and the subsequent Y;

    *Keep the ORDER value for first record of VARIABLE=N*;
    proc sort data=base_order out=base_order_n;
    where variable="N";
    by key order;
    run;

    data nfirst(rename=(order=order_n));
    set base_order_n;
    by key; *BY statement is needed so that FIRST.KEY is defined;
    if first.key;
    run;

    *Keep the ORDER value for last record of VARIABLE=Y*;
    proc sort data=base_order out=base_order_y;
    where variable="Y";
    by key DESCENDING order;
    run;

    data ylast(rename=(order=order_y));
    set base_order_y;
    by key; *BY statement is needed so that FIRST.KEY is defined;
    if first.key;
    run;

    *Merge by KEY, to get ORDER_N and ORDER_Y, then compare values to determine whether to keep*;
    data ntoy;
    merge nfirst ylast;
    by key;
    if order_n=. then delete;
    if order_n > order_y then delete;
    run;
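
    If BASE is already sorted by KEY, as the question says, the same rule (a Y anywhere after an earlier N within the key) can probably be checked in a single DATA step pass, without the sorts and the merge. This is only a sketch: the data set name NTOY_ONEPASS and the flags SEEN_N and WRITTEN are made up for illustration, and the sample rows are repeated so the block runs on its own.

```sas
* Sample rows from the question, already sorted by KEY;
data base;
  input key variable $;
  datalines;
1 Y
1 N
1 Y
2 N
2 Y
3 Y
3 Y
3 N
;
run;

* One pass: remember whether an N has been seen within the current KEY;
data ntoy_onepass(keep=key);
  set base;
  by key;
  retain seen_n written;
  * Reset both flags at the start of each KEY;
  if first.key then do;
    seen_n = 0;
    written = 0;
  end;
  if variable = "N" then seen_n = 1;
  * First Y that follows an N: write the KEY once;
  else if variable = "Y" and seen_n and not written then do;
    written = 1;
    output;
  end;
run;

proc print data=ntoy_onepass noobs;
run;
```

    For the sample rows this should list keys 1 and 2. If you only want keys where the Y comes immediately after an N, the flag would have to be reset whenever a Y is read; that is a different rule from the one used above.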
