Degrees of Freedom for Random Effects

ledzep

Point Mass at Zero
#1
Hi,

I am looking at the example which appears in this R page:
http://stat.ethz.ch/R-manual/R-devel/library/survival/html/frailty.html

Code:
library(survival)
# Random institutional effect
# Supplying df (i.e. method = "df") fixes the degrees of freedom for the random effect in the model.

coxph(Surv(time, status) ~ age + frailty(inst, df=4), lung)


# Output
Call:
coxph(formula = Surv(time, status) ~ age + frailty(inst, df = 4), 
    data = lung)

                      coef   se(coef) se2     Chisq DF   p    
age                   0.0194 0.00933  0.00925 4.31  1.00 0.038
frailty(inst, df = 4)                         3.33  3.99 0.500

Iterations: 3 outer, 10 Newton-Raphson
     Variance of random effect= 0.038   I-likelihood = -743.6 
Degrees of freedom for terms= 1 4 
Likelihood ratio test=9.96  on 4.97 df, p=0.075
  n=227 (1 observation deleted due to missingness)

In this example, the degrees of freedom for the random institution effect are fixed at 4.
But how was that value chosen?

This clearly can't be the number of institutions as there are 19 institutions.
> length(unique(lung$inst))
[1] 19

So how should one decide what degrees of freedom to specify for a random institution effect? What was the reason for choosing df=4?

I would have fixed df=1, because I would assume the institutions come from a common distribution (e.g. gamma), and I would lose one degree of freedom for estimating the variance of this random effect.

Is this df actually the degrees of freedom for the random effects, or is it something like the degrees of freedom for smoothing splines?

Please join in the discussion and share your thoughts.
 

ledzep

Point Mass at Zero
#2
In case someone happens to come across this thread (future searches)...

I cross-posted this problem to the R mailing list, where it caught the attention of Dr. Terry Therneau (the author of two brilliant R packages, survival and coxme).

There seems to be no justification for the use of 4 degrees of freedom for the random institution effect, apart from ease of interpretation.

His reply [via email]:

Code:
In that particular example the value of "4" was pulled out of the air. 
There is no particular justification.

There is a strong relationship between the "effective" degrees of 
freedom and the variance of the random effect, and I often find the df 
scale easier to interpret. See the Hodges and Sargent paper in 
Biometrika (2001) for a nice explanation of the connection in linear models.

Terry T.
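
For anyone who wants to see the df/variance relationship mentioned in the reply on this dataset, here is a minimal sketch that refits the same model with a few requested df values. The values 2 and 8 are arbitrary choices of mine for illustration (only 4 comes from the help page example), and some fits may emit convergence warnings. Compare the "Variance of random effect=" line across the printouts.

```r
library(survival)

# Arbitrary illustrative values; only df = 4 appears in the help page example
df_values <- c(2, 4, 8)

# Refit the same gamma frailty model once per requested df
fits <- lapply(df_values, function(k) {
  coxph(Surv(time, status) ~ age + frailty(inst, df = k), data = lung)
})

# Print each fit; the variance of the random effect is reported in the output
for (j in seq_along(fits)) {
  cat("\n--- requested df =", df_values[j], "---\n")
  print(fits[[j]])
}
```

If the variance moves in step with the requested df, that is the connection being described: fixing df is just another way of fixing how much the institutions are allowed to differ.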
 

Link

Ninja say what!?!
#3
FYI: One common method for deciding on the value of a model parameter is to use cross-validation (CV) and choose the value that minimises a loss function. I would suggest doing the same here if you are unsure which value to pick for the df.
 

ledzep

Point Mass at Zero
#4
Hey Link, Thank you for your reply. I am not familiar with this method. Can you possibly explain a bit more or kindly direct me to a similar example?
 

Link

Ninja say what!?!
#5
Sure. The method is quite easy. Here are the general steps:
Code:
1) Break the data into K folds.
2) Choose the range of DF values that you would like to test.
3) With the first value of DF that you would like to test:
    i) In a loop from i=1 to K, fit the model on the other K-1 folds, leaving the ith fold out.
    ii) Use the fitted model to predict the values for the ith fold.
    iii) Choose a loss function (e.g. L2 loss, which is just the squared error) and calculate the loss for each held-out observation (e.g. loss = (predicted - actual)^2).
    iv) Take the mean of the loss over all the observations. This is normally called the CV risk.
    v) Now you have the CV risk for this DF value.
4) Do this for all the values of DF that you are testing.
5) Choose the DF that gave you the lowest CV risk.
This can be done using the optim function in R, though I normally do it in a loop so that I can have all the risk estimates to look at.
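
The steps above could be sketched for the lung example roughly as follows. This is only a sketch under several assumptions of mine: K = 5, the df grid, and the seed are arbitrary. Squared error is not directly usable with censored survival times, so the loss is swapped for 1 minus the concordance on the held-out fold. The held-out linear predictor is built by hand from `fit$frail`, which I am assuming holds one log-frailty per institution in sorted order. It also needs a version of survival that provides `concordance()`.

```r
library(survival)

set.seed(1)  # fold assignment is random; the seed is arbitrary

# Keep complete cases so every fold uses the same rows
dat <- na.omit(lung[, c("time", "status", "age", "inst")])

K       <- 5                 # number of folds (arbitrary choice)
df_grid <- c(2, 4, 6, 8)     # candidate df values (arbitrary choice)
fold    <- sample(rep(seq_len(K), length.out = nrow(dat)))

cv_risk <- sapply(df_grid, function(k) {
  fold_loss <- sapply(seq_len(K), function(i) {
    train <- dat[fold != i, ]
    test  <- dat[fold == i, ]

    fit <- coxph(Surv(time, status) ~ age + frailty(inst, df = k), data = train)

    # Assumption: fit$frail is ordered by sorted institution code.
    # Institutions that do not appear in the training folds get 0.
    re <- fit$frail[match(test$inst, sort(unique(train$inst)))]
    re[is.na(re)] <- 0

    # Linear predictor for the held-out fold: age effect plus institution effect
    test$lp <- coef(fit)["age"] * test$age + re

    # Loss = 1 - concordance (higher lp should mean shorter survival)
    1 - concordance(Surv(time, status) ~ lp, data = test, reverse = TRUE)$concordance
  })
  mean(fold_loss)  # CV risk for this df value
})

print(data.frame(df = df_grid, cv_risk = cv_risk))
best_df <- df_grid[which.min(cv_risk)]
```

With only 19 institutions and a small frailty variance, the CV risk may well be nearly flat across df values, so I would look at the whole table rather than trust `best_df` blindly.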


I did a quick google search and here are some articles:
http://cw.felk.cvut.cz/lib/exe/fetch.php/courses/ae4m33sad/13_tutorial.pdf
ftp://reports.stanford.edu/pub/cstr/reports/cs/tn/94/12/CS-TN-94-12.pdf

I hope that helps.

PS. Sorry for the delay responding. I find myself more and more busy these days.
 
#6
Hi,
Can I ask whether you had time to apply the method described above to the lung dataset example, or to any other survival analysis dataset?
Thanks

Panas