Help: How to write a SAS program to count births over multiple variables?

#1
Dear almighty statistical masters,

I would be infinitely grateful for some assistance with the following problem:

I use SPSS (although I am by no means an expert), but an institution that mainly uses SAS has been generous enough to grant me access to a large database I would like to analyze. One of the potential exposures that may lead to the disease I am studying is multi-parity (more than one pregnancy). I would thus like to figure out how many births (live births or stillbirths) a woman has had prior to her current pregnancy.

In the questionnaire each previous pregnancy is recorded as a line where you can fill out information regarding that pregnancy. I am only interested in two of the variables, namely "Outcome" and "Length". "Outcome" records how the pregnancy ended (1=Live birth, 2=Stillbirth, 3=Provoked abortion, 4=Ectopic pregnancy, etc.), and "Length" records the number of gestational weeks. There is one pair of variables for each pregnancy, i.e. the outcomes of one woman's potential previous pregnancies are contained in the variables AA95 AA101 AA107 AA113 AA119 AA125 AA131 AA137 AA143 AA149. The corresponding lengths are stored in AA96 AA102 AA108 AA114 AA120 AA126 AA132 AA138 AA144 AA150.

Now, I would like to count how many pregnancies a woman has had before her current one. Some types of pregnancy only count if they proceeded beyond 12 weeks of gestation, i.e. LEN>12, so I need a condition for these. How do I do this in SAS?

A corresponding program for SPSS looks like this:


COMPUTE PARI_MB=0.

DO REPEAT OUT = AA95 AA101 AA107 AA113 AA119 AA125 AA131 AA137 AA143 AA149
/ LEN = AA96 AA102 AA108 AA114 AA120 AA126 AA132 AA138 AA144 AA150.
IF (OUT=1 OR OUT=5 OR ((OUT=2 OR OUT=3 OR OUT=4 OR OUT=6 OR OUT=7) AND LEN>12) ) PARI_MB=PARI_MB+1.
END REPEAT.

EXECUTE.

FREQUENCIES
VARIABLES=PARI_MB
/ORDER=ANALYSIS.


I tried rewriting it for SAS, but ran into a wall. I would be amazingly grateful for help, and even more so if you could intersperse the code with comments so that I learn how to do it myself next time, not to mention that I may then be able to help the next person on the forum with a similar problem :)

Have a great week!

All the best,
petkiri
 

jrai

New Member
#2
This is the first thing that came to my mind. There might be some more efficient ways to do it:
Split the data set into two, one with the outcomes and the other with the lengths:
data outc(drop=AA96 AA102 AA108 AA114 AA120 AA126 AA132 AA138 AA144 AA150) leng(drop=AA95 AA101 AA107 AA113 AA119 AA125 AA131 AA137 AA143 AA149);
set main_file_name;
run;

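Now turn the columns into rows for both data sets. Let's call the variable containing the primary key subject. Because of the BY statement, the data must already be sorted by subject (run proc sort first if it is not).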
proc transpose data=outc out=outc1(rename=(col1=outcome));
var AA95 AA101 AA107 AA113 AA119 AA125 AA131 AA137 AA143 AA149;
by subject;
run;

proc transpose data=leng out=leng1(rename=(col1=length));
var AA96 AA102 AA108 AA114 AA120 AA126 AA132 AA138 AA144 AA150;
by subject;
run;

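Now put them side by side. The two SET statements read both data sets in parallel, row by row, which works here because both have the same rows in the same order: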
data final;
set outc1(drop=_name_);
set leng1(drop=subject _name_);
run;

This will give you a data set with the following structure:
subject outcome length
1 1 5
1 2 5
1 3 12
2 2 2
2 3 4

The idea is essentially to transform the columns into rows.
The qualifying rows can then simply be counted with the following code:
proc sql;
select subject, count(outcome) as count
from final(where=(outcome in (1,5) or (outcome in (2,3,4,6,7) and length>12)))
group by subject;
quit;
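
One caveat with the query above: the WHERE clause filters rows before they are counted, so a subject without any qualifying pregnancy drops out of the result instead of showing up with a count of 0. A possible variant, only a sketch that assumes the final data set built above, sums the condition itself. The output table name pari is just a placeholder I picked, and PARI_MB is borrowed from the SPSS code:

```sas
proc sql;
  /* pari is a hypothetical output table name */
  create table pari as
  select subject,
         /* in SAS a true condition evaluates to 1 and a false one to 0,
            so summing the condition counts the qualifying pregnancies */
         sum(outcome in (1,5)
             or (outcome in (2,3,4,6,7) and length > 12)) as PARI_MB
  from final
  group by subject;
quit;
```

Unused pregnancy slots (missing outcome) contribute 0, so every subject keeps a row.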

As I said, there may be shorter ways of doing it. I'll try to post another version in the morning.

Hope it helps.
 

jrai

New Member
#3
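OK, here is the simpler logic using the ARRAY statement. This converts your SPSS code to SAS code:

data final;            /* output data set; the name is only a placeholder */
set main_file_name;    /* your input data set */
/* a holds the outcomes, b the lengths, in matching order */
array a{*} AA95 AA101 AA107 AA113 AA119 AA125 AA131 AA137 AA143 AA149;
array b{*} AA96 AA102 AA108 AA114 AA120 AA126 AA132 AA138 AA144 AA150;
PARI_MB=0;
/* loop over the ten pregnancy slots, like DO REPEAT in SPSS */
do i=1 to dim(a);
if a{i} in (1,5) or (a{i} in (2,3,4,6,7) and b{i}>12) then PARI_MB=PARI_MB+1;
end;
drop i;                /* the loop counter is not needed in the output */
run;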
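
To check the array logic and get the equivalent of the SPSS FREQUENCIES step, something like the following should do. This is only a sketch: the data sets demo and demo_counted and their values are made up for testing, and only the first two pregnancy slots are used to keep it short.

```sas
/* Made-up test data: three women, two pregnancy slots each */
data demo;
  input subject AA95 AA96 AA101 AA102;
  datalines;
1 1 40 2 20
2 3 8 . .
3 5 39 4 13
;
run;

data demo_counted;
  set demo;
  array outc{*} AA95 AA101;   /* outcomes */
  array len{*}  AA96 AA102;   /* matching lengths in weeks */
  PARI_MB = 0;
  do i = 1 to dim(outc);
    /* outcomes 1 and 5 always count; 2,3,4,6,7 only beyond 12 weeks */
    if outc{i} in (1,5) or (outc{i} in (2,3,4,6,7) and len{i} > 12)
      then PARI_MB = PARI_MB + 1;
  end;
  drop i;
run;

/* Frequency table of the count, like FREQUENCIES in SPSS */
proc freq data=demo_counted;
  tables PARI_MB;
run;
```

With this test data PARI_MB comes out as 2, 0 and 2 for subjects 1, 2 and 3. A missing length never satisfies len{i} > 12, so empty slots are not counted.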