Summing number of "successes" over multiple related binary variables for a patient

ondansetron

TS Contributor
Hey everyone,

I'm trying to find a more efficient way to have SAS look at each observation, count the number of "successes" across specific variables, and store that count in a new variable for the observation.

OBS X1 X2 X3
1 1 0 0
2 1 1 1
.
.
.
n 0 1 0

I want the syntax to tell SAS to look in each row and only count the 1's for X1 and X3 while ignoring X2. I want the sum of the 1's to enter into a new variable, X4.
For example, OBS 1 X4=1, OBS 2 X4=2, OBS n X4=0.

I have been using just:

X4 = 0; /* start at zero so X4 is never left missing */
IF X1 = 1 THEN X4 = X4 + 1;
IF X3 = 1 THEN X4 = X4 + 1;

Some of my variables are currently in Y/N format, and I'd like to improve my skills with this. The easier solution I see is to recode Y/N as numeric 1/0 and then just use SUM(X1, X3) to arrive at X4.
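
A minimal sketch of that recode-then-sum idea, assuming X1 is already numeric 0/1 and the Y/N version of X3 lives in a character variable. The dataset names have and want and the variable name X3_YN are made up for illustration; they are not from the actual data.

```sas
/* Hypothetical input: X1 numeric 0/1, X3_YN character Y/N */
data have;
    input OBS X1 X2 X3_YN $;
    datalines;
1 1 0 N
2 1 1 Y
3 0 1 N
;
run;

data want;
    set have;
    /* A comparison evaluates to 1 (true) or 0 (false) in SAS.   */
    /* Keep a missing Y/N missing instead of silently making it 0. */
    /* Assumption: any non-missing value other than Y counts as 0. */
    if missing(X3_YN) then X3 = .;
    else X3 = (upcase(X3_YN) = 'Y');

    /* X2 is simply left out of the argument list */
    X4 = sum(X1, X3);
run;

proc print data=want noobs;
run;
```

With these three made-up rows, X4 should come out as 1, 2, and 0, matching the pattern in the example above.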

Looking forward to some other ideas that are surely better principled and more efficient.

hlsmith

Less is more. Stay pure. Stay poor.
Your way seems fine. An approach like Dason's seems fine as well. I think I would do a summation, X1 + X2 + ... + Xk.

I would also deliberately add a missing value once, as a quality check, so you know what to expect if there is missing data. There is always the fear that you have missing data, your algorithm doesn't notify you of it, and you make assumptions thinking you have data for everyone and every variable in the sample, when the data could be missing not at random.
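
One way to run that kind of check is to feed a few rows with planted missing values through both styles of summation and compare. This is only a sketch; the dataset name check and the variable names X4_plus, X4_sum, and n_missing are hypothetical.

```sas
data check;
    input OBS X1 X3;

    /* + operator: if any operand is missing, the result is missing */
    X4_plus = X1 + X3;

    /* SUM function: missing arguments are skipped;            */
    /* the result is missing only if every argument is missing */
    X4_sum = sum(X1, X3);

    /* NMISS counts how many of the numeric arguments are missing */
    n_missing = nmiss(X1, X3);

    datalines;
1 1 0
2 1 .
3 . .
;
run;

proc print data=check noobs;
run;
```

For OBS 2 the two versions disagree: X4_plus is missing while X4_sum is 1. Carrying something like n_missing along makes it visible when a count was built from incomplete data rather than silently treating missing as "no".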

ondansetron

TS Contributor
Good idea. As a side exercise, I was looking for an efficient way to handle the case where some variables are coded as bytes with 1 or 0 and others are coded as characters with Y/N. If I had 10 variables to look at, would there be an efficient way to create this?
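
For what it's worth, one possible approach to the mixed case is two arrays, one for the numeric flags and one for the character flags, with a loop over each. The sketch below uses five hypothetical variables (C1-C3 numeric, C4-C5 character) rather than ten to keep it short; all names are assumptions.

```sas
/* Hypothetical input: C1-C3 are numeric 0/1, C4-C5 are character Y/N */
data have;
    input OBS C1 C2 C3 C4 $ C5 $;
    datalines;
1 1 0 0 Y N
2 1 1 1 Y Y
3 0 0 0 N N
;
run;

data want;
    set have;
    array num_flags{*} C1-C3;     /* numeric flags   */
    array chr_flags{*} $ C4-C5;   /* character flags */

    count = 0;
    do i = 1 to dim(num_flags);
        if num_flags{i} = 1 then count = count + 1;
    end;
    do i = 1 to dim(chr_flags);
        if upcase(chr_flags{i}) = 'Y' then count = count + 1;
    end;
    drop i;
run;
```

Note that this version treats a missing value the same as a 0 or an N, so it is worth pairing with a separate missing-data check.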

Dason

For ten variables? Probably just easiest to treat each one on a case-by-case basis.

ondansetron

TS Contributor
I agree, but I was wondering if there is value in treating the count as discrete jumps versus some other function of the count (which also saves df). Overall, would you see any benefit to this approach, or is it better to treat them with dummies?

An example would be the number of medical comorbidities: out of diabetes, heart failure, an autoimmune condition, cancer, and HIV, how many does the patient have? Just an example.

hlsmith

Less is more. Stay pure. Stay poor.
Well, above your X3 put a hiccup in the mix, but I would usually treat comorbidities as binary, give them all sequential variable names, and sum across, something like X1-X23.
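
A short sketch of summing across sequentially named variables, using five hypothetical comorbidity flags X1-X5 instead of 23. The dataset and result variable names are made up. The OF keyword matters here: without it, sum(X1-X5) would be read as X1 minus X5.

```sas
/* Hypothetical data: five binary comorbidity flags per patient */
data comorbid;
    input ID X1-X5;
    datalines;
1 1 0 0 1 0
2 0 0 0 0 0
3 1 1 . 1 1
;
run;

data comorbid_count;
    set comorbid;

    /* OF turns X1-X5 into a variable list */
    n_comorbid = sum(of X1-X5);

    /* A list can be mixed with single variables, e.g. to skip X2 */
    n_skip_x2 = sum(X1, of X3-X5);
run;

proc print data=comorbid_count noobs;
run;
```

With these rows n_comorbid should be 2, 0, and 4. The third patient has one missing flag that SUM skips, so that 4 is a count over four known values, not five.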