SAS programming help needed please!

#1
Hi,

I am writing a SAS program that has too many IF THEN statements and it is getting overwhelmingly complicated. I am working with ages and placement of number of kids into "age buckets". Can you please take a look and try to figure out how to help with this (maybe with an array or a loop?).

Variables:

Ager (age at interview time - coded: 15-44)
Howmany (how many kids do you have - coded: 1-9)
Fathers_age_kid1
Fathers_age_kid2
Fathers_age_kid3
Fathers_age_kid4
Fathers_age_kid5
Fathers_age_kid6
Fathers_age_kid7
Fathers_age_kid8
Fathers_age_kid9

Need data to output into:
Bucket1 (did father have kid between 15-19? If yes, drop here)
Bucket2 (did father have kid between 20-24? If yes, drop here - and if there is a number in the previous bucket, add 1 to the previous number)
Bucket3 (did father have kid between 25-29? If yes, drop here - and if there is a number in the previous buckets, add 1 to the previous number)
Bucket4 (did father have kid between 30-34? If yes, drop here - and if there is a number in the previous buckets, add 1 to the previous number)
Bucket5 (did father have kid between 35-39? If yes, drop here - and if there is a number in the previous buckets, add 1 to the previous number)
Bucket6 (did father have kid between 40-44? If yes, drop here - and if there is a number in the previous buckets, add 1 to the previous number)

So, each bucket with ascending age groups will keep a count of how many kids at what age. There are missing values involved in each father's age of kid(s) and that is throwing my program off. Plus, there are multiple births (twins, triplets, quads, etc.) or kids born a year or two after the previous one which will not increment the same bucket N.

Thanks very much for your help with this.
 

geosun

New Member
#2
this is kind of confusing can you put a small sample of the data you have and the output you want?
it seems like you need to use a macro variable to run through all the columns and change them into an age group to be later categorized and summed up to do this in an efficient maner
 

geosun

New Member
#3
i'm guessing something like this?
Code:
%macro create(i);
	%do n=1 %to &i;
		data ager&n.;
			set ager;
			if Fathers_age_kid&n.>15 and Fathers_age_kid&n.<19 then Bucket&n.=1;
			if Fathers_age_kid&n.>20 and Fathers_age_kid&n.<24 then Bucket&n.=2;
			if Fathers_age_kid&n.>25 and Fathers_age_kid&n.<29 then Bucket&n.=3;
			if Fathers_age_kid&n.>25 and Fathers_age_kid&n.<29 then Bucket&n.=4;
			if Fathers_age_kid&n.>30 and Fathers_age_kid&n.<34 then Bucket&n.=5;
			if Fathers_age_kid&n.>35 and Fathers_age_kid&n.<39 then Bucket&n.=6;
			if Fathers_age_kid&n.>40 and Fathers_age_kid&n.<44 then Bucket&n.=7;
		run;
		data ager&n.;
			set ager&n.(keep=Bucket&n.);
			rename Bucket&n.=Bucketall;
		run;
	%end;
%mend create;

%create(9)

data agerall;
	set ager1 ager2 ager3 ager4 ager5 ager6 ager7 ager8 ager9;
	if Bucketall=. then delete;
run;
proc freq data=agerall;
	tables Bucketall / norow nocol;
run;
 
#4
Geosun,
The created dataset (with the incorrect Bucketn vars is located here -- please drop those vars when running your program).

http://wikisend.com/download/143606/age_kids_buckets.sas7bdat

A case to use for testing from this dataset is the one where someone has 9 kids (and there is only one case like this):

What I need is this:

Age of father=44
Number of kids=9
Age at 1st kid=20 (code should drop a "1" in Bucket2)
Age at 2nd kid=22 (code should drop a "1" in Bucket2 and increment the N to "2")
Age at 3rd kid=24 (code should drop a "1" in Bucket2 and increment the N to "3")
Age at 4th kid=29 (code should drop a "1" in Bucket3 and increment the N to "4")
Age at 5th kid=31 (code should drop a "1" in Bucket4 and increment the N to "5")
Age at 6th kid=36 (code should drop a "1" in Bucket5 and increment the N to "6")
Age at 7th kid=37 (code should drop a "1" in Bucket5 and increment the N to "7")
Age at 8th kid=41 (code should drop a "1" in Bucket6 and increment the N to "8")
Age at 9th kid=43 (code should drop a "1" in Bucket6 and increment the N to "9")

The complicated part for me is the increasing N in each bucket indicating the total number of kids over time in each age group. As you will see in the data, the next problem with these data are missing values at any of the age groups in which I could shift data toward the left if there is missing values in any of the fa_bioageN vars.

Thank you again for your help.
 

geosun

New Member
#5
i'm not sure why you need to increment the N? you already have the number of kids
but here's where i'm at right now :
i still need to find a way to sum each column by the key
Code:
data ager;
	set Tmp1.Age_kids_buckets;
	n=1;
run;
data ager(keep=key ager fa_bioage1 fa_bioage2 fa_bioage3 fa_bioage4 fa_bioage5 fa_bioage6 fa_bioage7 fa_bioage8 fa_bioage9);
	set ager;
	by n;
	if first.n then Key=0;
	key+n;
	output;
run;
/* lets split the specials*/
data agera;
	set ager;
	if fa_bioage1=.;
run;
data ager
	set ager;
	if fa_bioage1 ne .;
run;
proc transpose data=ager out=agernew;
by key;
var fa_bioage1 fa_bioage2 fa_bioage3 fa_bioage4 fa_bioage5 fa_bioage6 fa_bioage7 fa_bioage8 fa_bioage9;
run;
data agernew1;
	set agernew;
	if col1=. then delete;
	if col1>15 and col1 le 19 then bucket1=1;
	if col1>19 and col1 le 24 then bucket2=1;
	if col1>24 and col1 le 29 then bucket3=1;
	if col1>29 and col1 le 34 then bucket4=1;
	if col1>34 and col1 le 39 then bucket5=1;
	if col1>39 and col1 le 44 then bucket6=1;
run;
 
#6
The reason for summing as we move along is because they want to know the total number of kids at each time interval.

Thanks for your help!
 

geosun

New Member
#7
then all you need to do is add to the end
Code:
data agernew1;
	set agernew1;
	by key;
	if first.key then n=0;
	n+1;
	output;
	if last.key;
run;
 

Mean Joe

TS Contributor
#8
What I need is this:

Age of father=44
Number of kids=9
Age at 1st kid=20 (code should drop a "1" in Bucket2)
Age at 2nd kid=22 (code should drop a "1" in Bucket2 and increment the N to "2")
Age at 3rd kid=24 (code should drop a "1" in Bucket2 and increment the N to "3")
Age at 4th kid=29 (code should drop a "1" in Bucket3 and increment the N to "4")
Age at 5th kid=31 (code should drop a "1" in Bucket4 and increment the N to "5")
Age at 6th kid=36 (code should drop a "1" in Bucket5 and increment the N to "6")
Age at 7th kid=37 (code should drop a "1" in Bucket5 and increment the N to "7")
Age at 8th kid=41 (code should drop a "1" in Bucket6 and increment the N to "8")
Age at 9th kid=43 (code should drop a "1" in Bucket6 and increment the N to "9")
Is this right? Or do you want this:

Age at 1st kid=20 (code should drop a "1" in Bucket2)
Age at 2nd kid=22 (code should drop a "1" in Bucket2 and increment the N to "2" for Bucket2)
Age at 3rd kid=24 (code should drop a "1" in Bucket2 and increment the N to "3" for Bucket2)
Age at 4th kid=29 (code should drop a "1" in Bucket3 and initialize the N to "1" for Bucket3)
Age at 5th kid=31 (code should drop a "1" in Bucket4 and initialize the N to "1" for Bucket4)
Age at 6th kid=36 (code should drop a "1" in Bucket5 and initialize the N to "1" for Bucket5)
Age at 7th kid=37 (code should drop a "1" in Bucket5 and increment the N to "2" for Bucket5)
Age at 8th kid=41 (code should drop a "1" in Bucket6 and initialize the N to "1" for Bucket6)
Age at 9th kid=43 (code should drop a "1" in Bucket6 and increment the N to "2" for Bucket6)
 
#9
Mean Joe, no it should increment like this: B1=B1, B2=B1+B2, B3=B1+B2+B3, B4=B1+B2+B3+B4, B5=B1+B2+B3+B4+B5, B6=B1+B2+B3+B4+B5+B6. The end bucket, dependent on age of father at interview, should have the total amount of kids (HOWMANY) as a check that it was done correctly.

Geosun, your program file ran OK but it did not provide the increments to each bucket nor did it place the columns of data in a horizontal format. This is what the above example should look like via a simple proc print:

--
AGER FA_BIO1 FA_BIO2 FA_BIO3 FA_BIO4 FA_BIO5 FA_BIO6 FA_BIO7 FA_BIO8 FA_BIO9 B1 B2 B3 B4 B5 B6
44 20 22 24 29 31 36 37 41 43 0 3 4 5 7 9
--

Thanks to both of you for your help with this!
 
#10
the buckets need to be in a vertical form to increment it easily
after that just use a proc sql match merge the original key to get it back into a horizontal form