SAS code to compare cell entries?


uihiba
04-07-2009, 12:22 PM
I have a data set with 2 columns that I need to compare for similarities. In the first column, I have a set of codes separated by commas and in the second column, I have another set of codes separated by commas.

For example, the first column has 100, 101, 102 (all three in one cell) and the second column has 105, 101, 107, 109, 110 (all in one cell).

I need SAS to compare the two cells and tell me which codes appear in both. For the example above, that would be 101.

I tried PROC COMPARE, but it keeps truncating the entries in each cell.

The entries are stored as character variables, if that makes a difference.

FYI: I'm not an expert programmer so I would appreciate a detailed/clear response. Thanks! :)

*Please don't suggest another software/language to tackle this. I HAVE to do this in SAS.*

uihiba
04-08-2009, 02:05 PM
Could I maybe use an array? If so, how would I declare the variables so SAS knows the values are separated by commas?

sasguru
04-09-2009, 12:34 PM
Sorry, I'm not giving you the exact code, but here is the logic.

Use the SCAN function in a DO loop to parse the first variable. Inside that loop, use a second DO loop with another SCAN call to parse the second variable. Inside the inner loop, compare the value parsed from the first variable with the value parsed from the second. If they are the same, display it; otherwise, continue.

Hope it's not all Greek and Latin. Let me know if you need more help.

uihiba
04-10-2009, 10:24 PM
Hi sasguru,

Following your advice, I typed the code as follows:

do i=1 to #;
do j=1 to #;
one= scan(first var,i,',');
two=scan(second var,j,',');
if one=two then equal='yes';
else equal='no';
end;
end;
run;

I got all "no"s, and I know this isn't right. What am I doing wrong?

sasguru
04-13-2009, 01:12 PM
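data test;
  first = "10,20,30,40";
  second = "50,80,30,90";

  /* Outer loop: pull each code out of FIRST (4 codes in this example) */
  do i = 1 to 4;
    one = scan(first, i, ',');
    /* Inner loop: compare it with each code in SECOND */
    do j = 1 to 4;
      two = scan(second, j, ',');
      /* Write one row per pair, flagged yes or no */
      if one = two then do; equal = 'yes'; output; end;
      else do; equal = 'no'; output; end;
    end;
  end;
run;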

Hope this helps. Drop any variables you don't need from the output data set.
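For lists like the ones in the original question, where a blank follows each comma and the number of codes can differ from cell to cell, the same nested SCAN loop can be generalized. What follows is only a minimal sketch, not run against the poster's data. The data set name `have`, the output data set `matches`, the variable `match`, and the lengths `$200` and `$20` are all assumptions made for illustration. It uses COUNTW (available from SAS 9.2) instead of a hardcoded 4, STRIP to remove the blanks around each code, and writes a row only when a code appears in both cells.

```sas
/* Hypothetical input: one row holding two comma-separated code lists */
data have;
  length first second $200;
  first  = "100, 101, 102";
  second = "105, 101, 107, 109, 110";
run;

data matches;
  set have;
  length match $20;  /* assumed maximum length of a single code */
  /* COUNTW counts the comma-delimited codes in the cell */
  do i = 1 to countw(first, ',');
    /* STRIP removes the blank that follows each comma */
    match = strip(scan(first, i, ','));
    do j = 1 to countw(second, ',');
      /* Keep only non-empty codes that appear in both cells */
      if match ne ' ' and match = strip(scan(second, j, ',')) then output;
    end;
  end;
  keep first second match;
run;
```

With the sample row above, `matches` should end up with a single row where `match` is 101. Note the two differences from the earlier attempt: the OUTPUT statement sits inside the inner loop, so a result is not overwritten by the next comparison, and the values are stripped before comparing, because " 101" with a leading blank is not equal to "101".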

uihiba
04-14-2009, 12:37 PM
Worked great, thanks! My supervisor suggested I try the INDEX/INDEXC functions, but those didn't do all of what I needed. :)
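For reference, here is a small sketch of how INDEX behaves on a comma-separated cell. It is not taken from the thread and the variable names are made up for illustration. INDEX searches for a substring, so it cannot tell a whole code from part of one. FINDW (SAS 9.2 and later) is shown alongside as one function that matches whole delimited words only.

```sas
data _null_;
  first = "100,101,102";
  /* INDEX finds "10" inside "100", although 10 is not one of the codes */
  pos_index = index(first, "10");
  /* FINDW only matches a whole comma-delimited word, so "10" is not found */
  pos_findw = findw(first, "10", ",");
  /* A complete code such as 101 is found by FINDW */
  pos_whole = findw(first, "101", ",");
  put pos_index= pos_findw= pos_whole=;
run;
```

The log line should show a nonzero position for `pos_index` and `pos_whole`, and 0 for `pos_findw`. Either way, checking one code at a time still needs a loop over the codes in the other cell, which is what the nested SCAN approach provides.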