Do I need this variable in the class statement? And IRRs

#1
I am trying to calculate incidence rate ratios (IRRs) and 95% CIs for the interaction between two variables: area and time. My variables are n, area, period, and age (using log of population as the offset). I have two related questions about the code I am using, which is:

proc genmod data=inf;
/* where age=1 ...;  optional: include to restrict to an age group; leave out for all ages */
class area age;
model n=area period area*period / dist=nb link=log offset=logpop type3;
/* period is not in CLASS, so area*period has one slope coefficient per area level */
estimate "Area 1 v area 2*period" area*period 1 -1 0/exp e;
/* coefficients 0 -1 1 give area 3 minus area 2, the reverse of the label */
estimate "Area 2 v area 3*period" area*period 0 -1 1/exp e;
estimate "Area 1 v area 3*period" area*period 1 0 -1/exp e;
run;

Q1 - I have 12 different age groups. If I include 'age' in the class statement, I get slightly different IRRs and certainly different p-values than if I do not include 'age' in the class statement. Should I be including age in the class statement for the model? There are times where I want to look at everything together (all-ages) and other times where I need to specify age groups that I want to look at, so I don't know if 'age' is needed in the class statement for each model or not.
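
On Q1, here is a minimal sketch of two things that might be worth trying, using the dataset and variable names from the code above. As written, age is listed in CLASS but never appears in MODEL, so the model does not adjust for it. If the results still change, one possible explanation (an assumption, not something confirmed here) is that observations with a missing age are dropped once age is a CLASS variable. Comparing "Number of Observations Used" between the two runs would confirm or rule that out.

```sas
/* Check for missing age values; inf and age are the names from the post */
proc freq data=inf;
  tables age / missing;
run;

/* Hypothetical variant: age entered as a covariate, not only listed in CLASS */
proc genmod data=inf;
  class area age;
  model n = area period area*period age / dist=nb link=log offset=logpop type3;
  estimate "Area 1 v area 2*period" area*period 1 -1 0 / exp e;
run;
```

Whether age belongs in the model at all is a judgment call that depends on the question being asked; the second step only shows what adjusting for it would look like.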

Q2 - I'm not totally sure how to interpret the output for the IRRs and CIs. I understand these in general (e.g., how to interpret an IRR of something like 1.3), but the output doesn't seem to make sense. For example, with the above code (NO age listed in class statement), I get the following:

Area 1 v 2: IRR (1.0007) 95% CI (0.9999-1.0015) p-value 0.1043
Area 2 v 3: IRR (0.9985) 95% CI (0.9977-0.9994) p-value 0.0007
Area 1 v 3: IRR (1.0022) 95% CI (1.0015-1.0028) p-value <0.0001

This doesn't quite make sense looking at the p-value that goes with the IRRs and CIs. However, when I DO include age in the class statement for the same dataset, I get:

Area 1 v 2: IRR (1.0006) 95% CI (0.9990-1.0022) p-value 0.4867
Area 2 v 3: IRR (0.9996) 95% CI (0.9979-1.0013) p-value 0.6697
Area 1 v 3: IRR (1.0009) 95% CI (0.9994-1.0025) p-value 0.2411

This seems to make more sense to me, as looking at the IRR and CI I wouldn't expect significant results. So does this mean that age should be included in the class statement? Thanks!
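
On Q2, a worked sketch of how the reported numbers relate to each other may help. It assumes that period enters as a continuous covariate (it is not in the CLASS statement) and that the ESTIMATE limits are Wald intervals. The symbols alpha, beta, gamma and a (area index) are introduced here for illustration only.

```latex
% a = area (1, 2, 3); gamma_a = area-specific slope coefficient for period
\begin{align*}
\log E[n] &= \log(\text{pop}) + \alpha_a + (\beta + \gamma_a)\,\text{period} \\
\text{estimate } (1,\,-1,\,0):\quad \widehat{\text{IRR}} &= \exp(\hat\gamma_1 - \hat\gamma_2) \\
95\%\ \text{CI} &= \exp\!\big((\hat\gamma_1 - \hat\gamma_2) \pm 1.96\,\text{SE}\big) \\[4pt]
\text{Area 2 v 3 row (no age in CLASS):}\quad \log(0.9985) &\approx -0.00150 \\
\text{SE} &\approx \frac{\log(0.9994) - \log(0.9977)}{2 \times 1.96} \approx 0.00043 \\
z &\approx \frac{-0.00150}{0.00043} \approx -3.5 \\[4pt]
\text{consistency check:}\quad \frac{1.0022}{1.0007} &\approx 1.0015 \approx \frac{1}{0.9985}
\end{align*}
```

Under these assumptions each IRR is a ratio of per-unit-of-period trends between two areas, so values very close to 1 are expected when one unit of period is small. A |z| of about 3.5 corresponds to a two-sided p-value well below 0.05, in line with the reported 0.0007 given rounding of the limits, and that interval excludes 1. The last line suggests the row labelled "Area 2 v 3" is actually area 3 relative to area 2.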
 

hlsmith

#2
Sorry, I just saw this post. I will stare at it more early next week if I have time, but I think you can either include it in the class statement or not, as long as you are content with your justification, since each model is saying something different. However, if you are running a bunch of pseudo-nested models, you may need to correct for false discovery.

As for your latter question, maybe the p-value changes because, as a categorical variable, age uses up more degrees of freedom?