Quick way to sum?

#1
I'm having trouble figuring out how to pick out the pieces of data that I need to then sum those pieces. So I have 3 areas ('east', 'west', and 'north'), month, age, and sex. I want to be able to get totals for these - so for example, the total for 'east' for records that are month '1', sex '1', and age '1', '2', '3', and '4'. So I don't know if I need an if-then/else statement or need to use a sum statement somewhere...I'm a bit lost!


I also don't know if the three areas (let's say they're currently entered as 'east', 'west', and 'north') should be changed to something like '1', '2', and '3' - but I cannot figure out how to do that either. I'm not sure if the label statement needs to be used or if this is just for labeling the actual variable and not categories within the variable.

Thank you!
 
#2
What you’ve written reads like “proc univariate” will do what you want.

There are also other approaches one could use, but it’s unclear what you mean by “the total for 'east' for records that are…”. Total of what? Or do you mean how many records fall into each possible combination of Area, Month, Sex and Age?
 
#3
What you’ve written reads like “proc univariate” will do what you want.

There are also other approaches one could use, but it’s unclear what you mean by “the total for 'east' for records that are…”. Total of what? Or do you mean how many records fall into each possible combination of Area, Month, Sex and Age?
Thanks. Sorry for it being a bit unclear. Yes, I do mean the number of records that fall into whichever combination I want to specify. So, how many records fall under 'east' for month 1, sex 1, and age 1. Or how many records fall under 'west' for month 5, sex 2, and age groups 1-4. I just don't know how to specify each query to get those results and what code I should use.
 
#4
I’m still not clear about how you want this to work. Try the following code and look at the output dataset. Then tell me what you do or do not want.

Code:
proc means data = <Your input dataset's name> noprint;
	class	Area
		Month
		Sex
		Age;
	output	out = <Frequency table dataset's name>
		(where = (_TYPE_ = 15));
run;
 

noetsi

Fortran must die
#5
To me it seems like PROC SQL would do this fairly easy assuming I understand what you want and you have some variable that you are counting like an ID (which is how you normally count records).

Proc SQL;
COUNT DISTINCT (varname),area, month, sex, age
FROM tabname
GROUP BY area, month, sex, age
RUN;
QUIT;

This will show you how many rows you have broken down by those variables. If you wanted to only count say how many rows there were for a given area for a given gender you would leave out all but those variables from the SELECT and GROUP by statement.
 
#6
Thanks. I ended up using the code from con-tester and have been working on that. I recoded some of my variables into specific age groups, then that code helped to give me the summary tables that I needed to get the sum for the different groupings. Sorry it was confusing, but it worked! :)