Proc HPREG class statement

noetsi

Fortran must die
#1
proc hpreg data=WORK.REG;
CLASS '2FL'n '10FL'n / PARAM=REFERENCE REF=Last;

So if the variable takes the values 0 and 1, is the last level 0 or 1?

It appears that this only allows LAST or FIRST; you cannot set REF=1 or REF=0. I am not sure what ordering LAST and FIRST refer to.
 

noetsi

Fortran must die
#2
I tried this in PROC GENMOD and it failed there too.

Code:
CLASS "10FL"n (REF =1) "2FL"n (ref=1)
;
This generates an error. It says it will accept only LAST or FIRST. I thought you could set the reference level to any valid level of the variable.
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
If you don't know which level it is using, just look at the output; it will tell you. I am pretty sure you can set ref in GENMOD, or at least in its ESTIMATE statement!
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
Here is an example of code I am using right now for a logistic model. I think you just change dist to normal (Gaussian) and link to identity and you should be good to go:

Code:
proc genmod data = LWBS.full descending ;
    class
       RaceCat (ref='white')
       CCcat (ref='Abdom')
       HoursCat (ref='OpWeekdayHours');
    model LWBS =
       age_v
       racecat
       CCcat
       HoursCat
       ED_Accuity_V
       Pulse_V
       Respirations_V
       Pulse_Ox_V
       Pain_Score_V
       /link=logit dist = binomial alpha=0.01;
    estimate 'Age' age_v 10/ alpha=0.01 exp;
    estimate 'Asian' RaceCat 1 0 0 0 0 -1 / alpha=0.01 exp;
run;
 

noetsi

Fortran must die
#5
It looks to me like it takes only FIRST and LAST, but maybe that is because I am trying to set it in the global CLASS options. If the reference level is a number, do you write (ref=1) or (ref='1')?
 

noetsi

Fortran must die
#6
When I run this code:

Code:
PROC GENMOD DATA=WORK.SORTTempTableSorted
PLOTS(ONLY)=ALL
;
CLASS "10FL"n (REF =1) "2FL"n (ref=1)
;
MODEL Q2Wage="2FL"n "10FL"n
/
;
I get this error:

Code:
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, FIRST, LAST.
ERROR 200-322: The symbol is not recognized and will be ignored.
It turns out you have to quote the level, as below, which makes no sense to me since it is a numeric field, not a string:

Code:
PROC GENMOD DATA=WORK.SORTTempTableSorted
PLOTS(ONLY)=ALL
;
CLASS "10FL"n (ref ='1') "2FL"n (ref='1')
;
MODEL Q2Wage="2FL"n "10FL"n
/
;
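
A note on why the quotes are needed: as far as I know, REF= in the CLASS statement is matched against the formatted value of the level, so it is written as a quoted string even when the variable is numeric. Below is a minimal sketch of the working pattern. WORK.DEMO and its wage values are made up purely for illustration, and dist=normal with link=identity is my assumption about the intended model (it follows the suggestion in post #4).

```sas
/* Sketch only: WORK.DEMO and its values are invented for illustration. */
options validvarname=any;   /* allow names such as '2FL'n that start with a digit */

data work.demo;
   input '2FL'n '10FL'n Q2Wage;
   datalines;
0 0 2200
0 1 1900
1 0 2000
1 1 1700
0 0 2300
0 1 1950
1 0 2050
1 1 1650
;
run;

proc genmod data=work.demo;
   /* REF= is given as a quoted string even though both variables are numeric */
   class '2FL'n (ref='1') '10FL'n (ref='1');
   /* assumed model: ordinary linear regression expressed as a GLM */
   model Q2Wage = '2FL'n '10FL'n / dist=normal link=identity;
run;
```

Changing ref='1' to ref='0' should only flip which level gets the zero row in the parameter table and reverse the sign of the estimate.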
Not having used GENMOD before, I found the output confusing. Is this what it should look like?

Parameter  Level  DF  Estimate   Std Error  Lower Limit  Upper Limit  Chi-Square  Pr > ChiSq
Intercept         1   1686.360   26.9170    1633.603     1739.116     3925.05     <.0001
2FL        0      1   239.0933   57.7919    125.8233     352.3633     17.12       <.0001
2FL        1      0   0.0000     0.0000     0.0000       0.0000       .           .
10FL       0      1   311.3500   51.7718    209.8790     412.8210     36.17       <.0001
10FL       1      0   0.0000     0.0000     0.0000       0.0000       .           .
Scale             1   3189.663   15.6405    3159.155     3220.465

So, for example, for variable 2FL the non-reference level (0) has a mean about 239 larger than the reference level (1).
 

hlsmith

Less is more. Stay pure. Stay poor.
#7
Generalized linear models typically use a version of maximum likelihood and allow you to state the distribution and link function. So I can use them to move between ORs, RRs, and RDs. There can be convergence or appropriateness issues with the latter two, but the log will tell you this.

Without seeing the rest of the output, it seems 2FL=1 has an expected mean of 1686 (when 10FL is also at its reference level, 1), and 2FL=0 has an expected mean 239 higher, controlling for 10FL.

What is up with the "n used with the variable name? I haven't seen that before.
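
For anyone else wondering about the trailing n: a quoted string followed by n is a SAS name literal. It lets you refer to a variable whose name breaks the usual naming rules, for example one that starts with a digit, as 2FL and 10FL do. A small sketch, assuming VALIDVARNAME=ANY is in effect; the data set WORK.NAMES and the variable flag2 are hypothetical names used only for this example.

```sas
/* Sketch only: WORK.NAMES and flag2 are hypothetical names. */
options validvarname=any;   /* permit variable names that are not standard SAS names */

data work.names;
   '2FL'n = 1;        /* name literal: the variable is literally called 2FL */
   flag2  = '2FL'n;   /* an ordinary name holding the same value            */
run;

/* Lists both variables; 2FL appears with its nonstandard name */
proc contents data=work.names;
run;
```

Code generated by point-and-click tools tends to write every nonstandard name this way, and either quote style ('2FL'n or "2FL"n) works.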
 

noetsi

Fortran must die
#8
I don't understand that, hlsmith. The code makes the reference level 1, so shouldn't the intercept correspond to level 0, not level 1? I thought that is how reference levels worked.

But I think what you wrote is correct based on other data I ran. It just seems strange to me for dummy variables to be done that way.
 

hlsmith

Less is more. Stay pure. Stay poor.
#9
In a model with one binary predictor, the intercept is the expected mean for the reference level, and the estimate is the difference for the other level. You get to set which level is the reference, so it can be whichever you select!
 

noetsi

Fortran must die
#10
To be clear, I have a variable that takes the values 0 and 1 in the data. This is a portion of the PROC GENMOD results.

2FL 0 239.0933
2FL 1 0.0000

Does this mean that, for this variable, those at level 0 in the raw data are on average (controlling for the other variables) 239 higher than those at level 1? It is the mean difference between levels that I am concerned with.
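
One way to check that reading is to rebuild the four cell means from the estimates posted in #6. With 1 as the reference level for both variables, the intercept is the expected mean when 2FL=1 and 10FL=1, and each nonzero estimate is added when that variable is at level 0. The sketch below is only arithmetic on the posted numbers; the data set name WORK.CELL_MEANS and the variables fl2, fl10, b_2fl_0 and b_10fl_0 are names I made up for the illustration.

```sas
/* Sketch only: rebuilds predicted means from the estimates posted in #6. */
data work.cell_means;
   intercept = 1686.360;   /* expected mean when 2FL=1 and 10FL=1 (both reference) */
   b_2fl_0   = 239.0933;   /* added when 2FL = 0  */
   b_10fl_0  = 311.3500;   /* added when 10FL = 0 */
   do fl2 = 0 to 1;
      do fl10 = 0 to 1;
         /* (fl2 = 0) evaluates to 1 when true and 0 when false */
         predicted = intercept + b_2fl_0*(fl2 = 0) + b_10fl_0*(fl10 = 0);
         output;
      end;
   end;
   keep fl2 fl10 predicted;
run;

proc print data=work.cell_means noobs;
run;
```

At either level of 10FL, the predicted mean for 2FL=0 minus the predicted mean for 2FL=1 is 239.0933, so under this additive model the estimate is the adjusted mean difference between level 0 and level 1.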
 