
Thread: Nominal predictor with 70 levels....

  1. #1
    Fortran must die
    noetsi's Avatar


    I have a nominal predictor with 70 levels (the levels are units that provide some service). I want to analyze how these units perform on an interval response variable (I will use linear regression) and on a two-level response variable (I will use logistic regression). I can of course compare each level against the other 69, which means 70 comparisons, but this seems less than ideal because of familywise error and because I don't know what I would really learn this way. What I really want is to compare every level against every other level.
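To put rough numbers on the familywise error concern, here is a small sketch. It assumes the tests are independent and each is run at alpha = 0.05, which is a simplification (comparisons that share units are correlated); the function name is only illustrative.

```python
from math import comb

def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests independent
    tests, each run at level alpha (independence is an assumption here)."""
    return 1.0 - (1.0 - alpha) ** n_tests

levels = 70
n_pairs = comb(levels, 2)  # number of distinct unit-vs-unit comparisons

print("pairwise comparisons:", n_pairs)  # 2415
print("FWER, 70 one-vs-rest tests:", familywise_error(levels))
print("FWER, all pairwise tests:", familywise_error(n_pairs))

# A Bonferroni correction would test each pair at alpha / n_pairs instead.
bonferroni_alpha = 0.05 / n_pairs
print("Bonferroni per-test alpha:", bonferroni_alpha)
```

With 2,415 pairs, an uncorrected all-pairs analysis is close to guaranteed to flag something by chance, and the Bonferroni per-test level becomes very small.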

    I was wondering if anyone had dealt with this type of issue before. I am trying to see how well the units did relative to each other, controlling for other variables. I could of course just do descriptive statistics, but I prefer not to, because you really can't control for other variables with descriptives.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  2. #2
    TS Contributor
    rogojel's Avatar
    Location
    I work in Europe, live in Hungary

    Re: Nominal predictor with 70 levels....

    hi,
    any chance of clustering the units first? If you could group the units in some way and use the centroids (?), that would solve the problem, imo.

  4. #3
    Human
    GretaGarbo's Avatar

    Re: Nominal predictor with 70 levels....

    Do a QQ-plot of all the estimated parameters. Those that deviate from a straight line will be "real" effects, in contrast to randomness.

  6. #4
    Fortran must die
    noetsi's Avatar

    Re: Nominal predictor with 70 levels....

    I have to comment on the performance of individual units, rogojel, and I don't think I can do that with clustering (does this mean factor analysis?).

  7. #5
    Fortran must die
    noetsi's Avatar

    Re: Nominal predictor with 70 levels....

    Quote Originally Posted by GretaGarbo
    Do a QQ-plot of all the estimated parameters. Those that deviate from a straight line will be "real" effects, in contrast to randomness.
    Why is this so, GretaGarbo? I have not seen this approach discussed before. Do you have a citation or link I could look at on this topic?

    Are you suggesting running the k-1 dummies and then using these as parameter estimates?

  8. #6
    Cookie Scientist
    Jake's Avatar
    Location
    Austin, TX

    Re: Nominal predictor with 70 levels....

    You could fit a mixed model treating the units as a random effect. In such a model you could throw in whatever covariates you like and examine the distribution of the random unit effects.
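As a rough illustration of the random-effect idea, the sketch below estimates shrunken unit effects for a balanced one-way layout using ANOVA (method-of-moments) variance components. It is a simplification under stated assumptions: equal group sizes, no covariates, and simulated data. A full mixed-model routine in a statistics package would be the way to add covariates. The function name and the simulated values are hypothetical.

```python
import random
from statistics import fmean

def shrunken_unit_effects(groups):
    """groups: dict mapping unit -> list of responses (equal lengths).
    Returns (effects, shrink): predicted unit effects, i.e. each unit's
    deviation from the grand mean multiplied by a common shrinkage factor."""
    k = len(groups)
    n = len(next(iter(groups.values())))
    if any(len(v) != n for v in groups.values()):
        raise ValueError("this sketch assumes equal group sizes")
    means = {u: fmean(v) for u, v in groups.items()}
    grand = fmean(list(means.values()))
    # Between-unit and within-unit mean squares from one-way ANOVA
    msb = n * sum((m - grand) ** 2 for m in means.values()) / (k - 1)
    msw = sum((y - means[u]) ** 2
              for u, v in groups.items() for y in v) / (k * (n - 1))
    var_unit = max(0.0, (msb - msw) / n)  # unit variance, truncated at 0
    shrink = var_unit / (var_unit + msw / n) if var_unit > 0 else 0.0
    return {u: shrink * (m - grand) for u, m in means.items()}, shrink

# Simulated example: 70 units, 10 observations each (all values made up)
rng = random.Random(0)
data = {}
for unit in range(70):
    true_effect = rng.gauss(0, 1)
    data[unit] = [true_effect + rng.gauss(0, 2) for _ in range(10)]

effects, shrink = shrunken_unit_effects(data)
print("shrinkage factor:", shrink)
print("lowest five units:", sorted(effects, key=effects.get)[:5])
```

The shrinkage factor is the share of the variance in the unit means that is attributed to real unit differences; noisy units are pulled toward the overall mean before they are ranked.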
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  10. #7
    TS Contributor
    Miner's Avatar
    Location
    Greater Milwaukee area

    Re: Nominal predictor with 70 levels....

    Quote Originally Posted by noetsi
    Why is this so, GretaGarbo? I have not seen this approach discussed before. Do you have a citation or link I could look at on this topic?

    Are you suggesting running the k-1 dummies and then using these as parameter estimates?
    I believe what Greta is referring to is a variant of the half-normal plot analysis. See http://math.uhcl.edu/li/teach/stat55...normalplot.pdf


    Have you considered using ANOM? The null hypothesis for ANOM is that the individual mean is the same as the overall group mean. See https://cran.r-project.org/web/packa...ettes/ANOM.pdf

  12. #8
    Human
    GretaGarbo's Avatar

    Re: Nominal predictor with 70 levels....

    When I saw Jake's answer in post 6, I thought, "yes of course, that would be a very good method" (and maybe even the "best"). You can sort of "throw in whatever .... you like" and the method will take care of it.

    (If I remember correctly, for the James-Stein rule to be valid, the groups should be randomly selected. Maybe noetsi's 70 groups can be thought of like that. Then, by James-Stein, shrinking towards the mean will decrease the mean squared error.)
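For reference, one common textbook form of the James-Stein estimator that shrinks toward the grand mean is written out below. The notation is an assumption for illustration: $x_i$ is the observed mean of unit $i$, $k$ is the number of units (70 here), and $\sigma^2$ is the variance of a unit mean, taken as known and equal across units.

```latex
\hat{\theta}_i^{\mathrm{JS}}
  = \bar{x} + \left( 1 - \frac{(k-3)\,\sigma^2}{\sum_{j=1}^{k} (x_j - \bar{x})^2} \right) (x_i - \bar{x}),
\qquad \bar{x} = \frac{1}{k} \sum_{j=1}^{k} x_j , \quad k \ge 4 .
```

When the unit means are spread out far more than $\sigma^2$ would explain, the factor in parentheses is near 1 and little shrinkage occurs; when the spread is about what noise alone would produce, the estimates are pulled strongly toward $\bar{x}$.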

    About the QQ-plots, I was thinking of this: if you generate 700 random normal numbers and put them in 70 groups, then the means of the 70 groups will also be normally distributed, and you can look at them with a QQ-plot (or a PP-plot or, I believe, a half-normal plot). The 70 means will lie on a straight line in the QQ-plot. Most of the means will be close to the overall mean and some will be larger, but they will still be close to a straight line in the QQ-plot. My suggestion is that "real effects" will deviate from the straight line.
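The simulation described above can be sketched without a plotting library by pairing the sorted group means with standard normal quantiles and checking how straight the resulting line is. This is only an illustration under assumptions: equal group sizes of 10, plotting positions (i - 0.5)/n, and a made-up shift of 3.0 added to five units to stand in for "real effects".

```python
import random
from statistics import NormalDist, fmean

def qq_points(values):
    """Pair each sorted value with the standard normal quantile at
    plotting position (i - 0.5) / n."""
    n = len(values)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(values)))

def qq_correlation(values):
    """Pearson correlation of the QQ points; near 1 means a straight line."""
    pts = qq_points(values)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2015)
# 700 random normal numbers in 70 groups of 10: pure noise, no unit effects
null_means = [fmean([rng.gauss(0, 1) for _ in range(10)]) for _ in range(70)]

# Hypothetical "real effects": shift the first five units upward
shifted_means = list(null_means)
for i in range(5):
    shifted_means[i] += 3.0

print("QQ correlation, noise only:", qq_correlation(null_means))
print("QQ correlation, with 5 real effects:", qq_correlation(shifted_means))
# The last five QQ points of the shifted data are the ones off the line.
print(qq_points(shifted_means)[-5:])
```

Plotting the output of `qq_points` gives the QQ-plot itself; the correlation is just a numeric summary of how closely the points follow a line.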

    But I did not think about what happens if the group sizes vary. Then the variances of the means will differ. I believe I have heard of methods to correct for that, but I don't remember them.

    And yes, I was thinking of using QQ-plots like they are used in 2^p factorial designs. (And I believe, but I am not sure, that the half normal plots are used just like the QQ-plots.)

  14. #9
    Fortran must die
    noetsi's Avatar

    Re: Nominal predictor with 70 levels....

    I have not worked with random effects outside multilevel models, but I will look at that. I have not heard of ANOM at all, but I will certainly look at that too.

  15. #10
    TS Contributor
    rogojel's Avatar
    Location
    I work in Europe, live in Hungary

    Re: Nominal predictor with 70 levels....


    Quote Originally Posted by noetsi
    I have to comment on the performance of individual units, rogojel, and I don't think I can do that with clustering (does this mean factor analysis?).
    In this case I think you would need to consider all of them in the model. It would not be necessary to compare each against each, though, if the goal is to form some ranking groups. Something like Tukey's HSD could be useful.

    regards
