Thread: calculation of the average leverage when predictor(s) is categorical

  1. #1
    gianmarco

    Hello,
    I was reading different resources about regression diagnostics, in particular for logistic regression.
    As for leverage, the sources suggest looking for observations with higher-than-average leverage.

    Where I am confused is how the mean leverage is calculated.
    One source suggests: (k+1)/N
    where k = number of predictors, N = sample size

    My questions:
    1) if one of the predictors is categorical, do we also have to count the levels of the categorical predictor in k?
    2) do we also have to count the intercept (I think not)?

    As for a practical example, given the dataset and the model below, how would you calculate the average leverage?
    Code: 
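    # Read the admissions data and treat rank as a categorical predictor
    mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
    mydata$rank <- factor(mydata$rank)

    head(mydata)
    #   admit gre  gpa rank
    # 1     0 380 3.61    3
    # 2     1 660 3.67    3
    # 3     1 800 4.00    1
    # 4     1 640 3.19    4
    # 5     0 520 2.93    4
    # 6     1 760 3.00    2

    # Logistic regression with two continuous predictors and the factor rank
    fit <- glm(admit ~ gre + gpa + rank, data = mydata, family = binomial(link = "logit"))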
    What I am wondering is whether, when counting the number of predictors (i.e., determining k), we also have to count the levels of categorical predictors.
    In other words, if we have 1 continuous predictor and 1 categorical predictor with 3 levels, would k be:
    2 (i.e., 1 continuous predictor + 1 categorical predictor)
    or
    3 (i.e., 1 continuous predictor + 2 dummy variables [the levels of the categorical predictor minus one, due to dummy coding])?
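
One way to look at the counting question empirically is to compare the mean of the leverages R reports with the number of columns in the design matrix. The sketch below is only an illustration: it uses simulated data shaped like the admissions file (the coefficients used to generate `admit` are made up, and `sim` and `fit_sim` are hypothetical names), so it runs without downloading anything. It relies on the fact that the leverages sum to the number of estimated coefficients, since the hat matrix is a projection.

```r
# Simulated stand-in for the admissions data (assumption: NOT the real file)
set.seed(1)
n <- 400
sim <- data.frame(
  gre  = round(rnorm(n, mean = 580, sd = 115)),
  gpa  = round(runif(n, min = 2.3, max = 4.0), 2),
  rank = factor(sample(1:4, n, replace = TRUE))
)
# Hypothetical outcome, generated only so that the model can be fitted
eta <- -3.5 + 0.002 * sim$gre + 0.8 * sim$gpa - 0.5 * as.integer(sim$rank)
sim$admit <- rbinom(n, size = 1, prob = plogis(eta))

fit_sim <- glm(admit ~ gre + gpa + rank, data = sim,
               family = binomial(link = "logit"))

# Design matrix columns: intercept, gre, gpa, rank2, rank3, rank4
X <- model.matrix(fit_sim)
colnames(X)
p <- ncol(X)             # 6 = 1 intercept + 2 continuous + (4 - 1) dummies

h <- hatvalues(fit_sim)  # leverage of each observation
c(mean_leverage = mean(h), p_over_n = p / n)
```

The two printed numbers should agree, which suggests reading (k+1)/N with k counting each dummy variable separately (here k = 5) and the "+1" standing for the intercept.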

    Thanks for any clarification
    gm
    http://cainarchaeology.weebly.com/

  2. #2
    gianmarco

    Re: calculation of the average leverage when predictor(s) is categorical

    I edited the second part of the question to make it (perhaps) clearer.

  3. #3
    hlsmith

    Re: calculation of the average leverage when predictor(s) is categorical

    "Leverage measures the potential impact of an individual case on the results, which is directly proportional to how far an individual case is from the centroid in the space of the predictors. Leverage is computed as the diagonal elements, $h_{ii}$, of the "Hat" matrix, $\mathbf{H}$,
    $$\mathbf{H} = \mathbf{X}^{*} \left( \mathbf{X}^{*\prime} \mathbf{X}^{*} \right)^{-1} \mathbf{X}^{*\prime}$$
    where $\mathbf{X}^{*} = \mathbf{V}^{1/2} \mathbf{X}$, and $\mathbf{V} = \mathrm{diag}\{ \hat{P} (1 - \hat{P}) \}$. As in OLS, leverage values are between 0 and 1, and a leverage value $h_{ii} > 2k/n$ is considered "large"; $k$ = number of predictors, $n$ = number of cases."
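
The hat matrix in the quoted formula can be built by hand and compared with what R's `hatvalues()` returns. This is a minimal sketch on simulated data; the variable names and the coefficients used to generate `y` are assumptions made up for the illustration.

```r
# Simulated data: 1 continuous predictor and 1 three-level factor (hypothetical)
set.seed(2)
n  <- 200
x1 <- rnorm(n)
g  <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
y  <- rbinom(n, size = 1, prob = plogis(0.3 + 0.8 * x1 - 0.5 * (g == "b")))

fit <- glm(y ~ x1 + g, family = binomial)

X     <- model.matrix(fit)     # n x 4: intercept, x1, gb, gc
p_hat <- fitted(fit)           # fitted probabilities
v     <- p_hat * (1 - p_hat)   # diagonal of V
Xs    <- sqrt(v) * X           # X* = V^(1/2) X (each row i scaled by sqrt(v[i]))

# H = X* (X*' X*)^(-1) X*'
H        <- Xs %*% solve(crossprod(Xs)) %*% t(Xs)
h_manual <- diag(H)

# Largest discrepancy from R's built-in leverages; it should be tiny,
# limited only by the convergence tolerance of the IWLS fit
max(abs(h_manual - hatvalues(fit)))

# The trace of H equals the number of columns of X (4 here)
sum(h_manual)
```

Because the trace of a projection matrix equals its rank, the leverages add up to the number of design-matrix columns, intercept included, so their mean is that count divided by n.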

    Taken from: http://www.datavis.ca/courses/grcat/grc6.html


    In my opinion, you would not include the intercept in the count, and yes, you would account for categorical predictors with three or more groups: each one contributes its number of levels minus one. So TS status (human, bot, raptor) would count as 2 predictors. I still regularly use SAS, so feel free to post R code for my edification.
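
Since R code was invited, here is a hedged sketch of the counting convention described in this reply, applied to the quoted 2k/n rule: k excludes the intercept and counts each dummy variable. The data are simulated, and the names `d`, `score`, and `status` are hypothetical.

```r
# Simulated data with a three-level factor (human, bot, raptor); made up
set.seed(3)
n <- 150
d <- data.frame(
  score  = rnorm(n),
  status = factor(sample(c("human", "bot", "raptor"), n, replace = TRUE))
)
d$y <- rbinom(n, size = 1, prob = plogis(0.2 + 0.7 * d$score))

fit <- glm(y ~ score + status, data = d, family = binomial)

# k = coefficients minus the intercept: 1 continuous + (3 - 1) dummies = 3
k      <- length(coef(fit)) - 1
cutoff <- 2 * k / n            # "large" leverage threshold from the quoted rule

h       <- hatvalues(fit)
flagged <- which(h > cutoff)   # row indices of high-leverage cases
d[flagged, ]
```

Other sources include the intercept in the count, e.g. 2(k+1)/n, so the threshold is a rule of thumb for screening cases rather than a strict test.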

  4. The Following User Says Thank You to hlsmith For This Useful Post:

    gianmarco (09-30-2016)

  5. #4

    Re: calculation of the average leverage when predictor(s) is categorical


    I was just looking here and there. This thread looked interesting, and it is. Informative, nice post.
