# Thread: Grouping & categorical question

1. ## Grouping & categorical question

I am trying to write some code to do the following, but am unable to figure it out.

I have a list of sorted keys and a categorical variable (Y/N), and I am trying to list the keys where the categorical variable goes from N to Y (only that way, not the other way around).

An example is this:

Key Variable

1 Y

1 N

1 Y

2 N

2 Y

3 Y

3 Y

3 N

So for this dataset, I would like an output of keys 1 and 2, since those are the only ones that move from N to Y within their key.

Any help is appreciated...thanks!

2. /* I think this'll work, maybe there's something better though... */

*Let's say you have this data set BASE (VARIABLE is read as character);
data base;
input key variable $;
datalines;
1 Y
1 N
1 Y
2 N
2 Y
3 Y
3 Y
3 N
;
run;

*Preserve order of data;
data base_order;
set base;
order=_N_;
run;

*The idea: if a KEY has a higher ORDER for a record where VARIABLE="Y" compared to the first record where VARIABLE="N", then you want to keep that KEY*
*eg YYY...NNN... -> no keep
YYY...NNN...Y -> keep, no matter how many Ns are between the first N and the subsequent Y*;

*Keep the ORDER value for first record of VARIABLE=N*;
proc sort data=base_order out=base_order_n;
where variable="N";
by key order;
run;

data nfirst(rename=(order=order_n));
set base_order_n;
*BY statement is required for FIRST.KEY to be defined;
by key;
if first.key;
run;

*Keep the ORDER value for last record of VARIABLE=Y*;
proc sort data=base_order out=base_order_y;
where variable="Y";
by key DESCENDING order;
run;

data ylast(rename=(order=order_y));
set base_order_y;
*BY statement is required for FIRST.KEY to be defined;
by key descending order;
if first.key;
run;

*Merge by KEY, to get ORDER_N and ORDER_Y, then compare values to determine whether to keep*;
data ntoy;
merge nfirst ylast;
by key;
if order_n=. then delete;
if order_n > order_y then delete;
run;
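
As for "something better": the same rule (keep a KEY if any Y comes after its first N) can probably be done in one pass with BY-group processing and RETAIN, without the sorts and the merge. This is only a sketch. It assumes BASE is already sorted by KEY, as in the question, and the names NTOY_ONEPASS, SEEN_N and FLAGGED are made up for illustration.

```sas
/* One-pass sketch: assumes BASE is sorted by KEY and that          */
/* VARIABLE is a character column holding "Y" or "N".              */
/* SEEN_N and FLAGGED are hypothetical helper variables.            */
data ntoy_onepass(keep=key);
  set base;
  by key;
  retain seen_n flagged;

  /* Reset the flags at the start of each KEY group */
  if first.key then do;
    seen_n = 0;
    flagged = 0;
  end;

  /* Remember that an N has occurred within this KEY */
  if variable = "N" then seen_n = 1;
  /* First Y after an N: write the KEY once, then ignore later Ys */
  else if variable = "Y" and seen_n and not flagged then do;
    flagged = 1;
    output;
  end;
run;
```

With the sample data this should leave one row each for KEY 1 and KEY 2, the same keys as NTOY above. If the data is grouped by KEY but not in ascending order, `by key notsorted;` would be needed instead.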
