I have a nominal-level predictor with 70 levels (the levels are units that provide some service). I want to analyze how these units perform on an interval response variable (I will use linear regression) and a two-level response variable (I will use logistic regression). I can of course compare each level against the other 69, but this seems less than ideal because of family-wise error, and because I don't know what I would really learn this way. I really want to compare every level against every other level.
I was wondering if anyone had dealt with this type of issue before. I am trying to see how well units did relative to each other while controlling for other variables. I could of course just report descriptive statistics, but I would rather not, because you really can't control for other variables with descriptives.
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
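To put a number on the family-wise error concern, here is a small sketch (the counts follow directly from having 70 levels; the alpha value is the usual 0.05 convention, not something stated in the post):

```python
from math import comb

# With k = 70 levels, comparing every level against every other level
# means k * (k - 1) / 2 distinct pairwise tests.
k = 70
n_pairs = comb(k, 2)  # number of pairwise comparisons

# A naive Bonferroni correction that holds the family-wise error rate at
# alpha = 0.05 would force each individual test to use a tiny threshold,
# which is one way to see why all-pairs testing is unattractive here.
alpha = 0.05
per_test_alpha = alpha / n_pairs

print(n_pairs, per_test_alpha)
```

With 2,415 comparisons, each test would need to run at roughly alpha = 0.00002, so almost nothing would reach significance.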
hi,
any chance of clustering the units first? If you could group units in some way and use the centroids (?) that would solve the problem imo.
noetsi (12-11-2015)
Do a QQ-plot of all the estimated parameters. Those that deviate from a straight line will be "real" effects, in contrast to randomness.
noetsi (12-11-2015)
I have to comment on the performance of individual units, rogjel, and I don't think I can do that with clustering (does this mean factor analysis?).
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
You could fit a mixed model treating the units as a random effect. In such a model you could throw in whatever covariates you like and examine the distribution of the random unit effects.
In God we trust. All others must bring data.
~W. Edwards Deming
GretaGarbo (12-12-2015), noetsi (12-15-2015)
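Jake's mixed-model suggestion amounts to shrinking each unit's mean toward the grand mean, with the amount of shrinkage set by the between-unit and within-unit variances. A minimal sketch of that idea, using simulated toy data (the 70 units, 10 observations each, and the handful of genuinely shifted units are all assumptions for illustration, not from the thread):

```python
import random
from statistics import mean, pvariance

random.seed(1)

# Toy data: 70 units with 10 observations each; units 0-2 are given a
# genuinely shifted mean so there is something to detect.
units = {u: [random.gauss(2.0 if u < 3 else 0.0, 1.0) for _ in range(10)]
         for u in range(70)}

unit_means = {u: mean(obs) for u, obs in units.items()}
grand = mean(unit_means.values())

# Method-of-moments estimates of the within-unit variance (sigma2) and the
# between-unit variance (tau2), then the shrinkage factor
# lam = tau2 / (tau2 + sigma2 / n) that a random-intercept model applies
# to each unit's deviation from the grand mean.
n = 10
sigma2 = mean(pvariance(obs) for obs in units.values())
between = pvariance(list(unit_means.values()))
tau2 = max(between - sigma2 / n, 0.0)
lam = tau2 / (tau2 + sigma2 / n)

# Shrunken unit effects (the BLUP-style predictions).
blup = {u: grand + lam * (m - grand) for u, m in unit_means.items()}
```

In a real analysis you would fit this with a mixed-model routine (e.g. lme4 in R) so covariates can be included, then examine the distribution of the predicted random unit effects as Jake describes.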
I believe what Greta is referring to is a variant of the half-normal plot analysis. See http://math.uhcl.edu/li/teach/stat55...normalplot.pdf
Have you considered using ANOM? The null hypothesis for ANOM is that the individual mean is the same as the overall group mean. See https://cran.r-project.org/web/packa...ettes/ANOM.pdf
noetsi (12-15-2015)
When I saw Jake's answer in post 6, I thought, "yes, of course, that would be a very good method" (and maybe even the "best"). You can sort of "throw in whatever .... you like" and the method will take care of it.
(If I remember it correctly, for the James and Stein rule to be valid, the groups should be randomly selected. Maybe noetsi's 70 groups can be thought of like that. Then by James and Stein, shrinking towards the mean will decrease the mean squared error.)
About the QQ-plots, I was thinking of this: if you generate 700 random normal numbers and put them in 70 groups, then the means of the 70 groups will also be normally distributed, and you can have a look at them with a QQ-plot (or a PP-plot or, I believe, a half-normal plot). The 70 numbers will lie on a straight line in the QQ-plot. Most of the group means will be close to the overall mean and some will be larger, but they will all be close to a straight line. My suggestion is that "real effects" will deviate from the straight line.
But I did not think about if the size of the groups varies. Then the variance of means will be different. I believe that I have heard of methods to correct for that, but I don't remember.
And yes, I was thinking of using QQ-plots like they are used in 2^p factorial designs. (And I believe, but I am not sure, that the half normal plots are used just like the QQ-plots.)
noetsi (12-15-2015)
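Greta's thought experiment is easy to reproduce. A sketch (the Blom-style plotting positions are one common convention, an assumption on my part, not something specified in the post):

```python
import random
from statistics import mean, NormalDist

random.seed(3)

# 700 standard-normal draws split into 70 groups of 10; under pure noise
# the 70 group means should lie close to a straight line when plotted
# against normal quantiles.
draws = [random.gauss(0.0, 1.0) for _ in range(700)]
groups = [draws[i * 10:(i + 1) * 10] for i in range(70)]
group_means = sorted(mean(g) for g in groups)

# Theoretical quantiles for the QQ-plot (Blom plotting positions).
k = len(group_means)
theo = [NormalDist().inv_cdf((i - 0.375) / (k + 0.25))
        for i in range(1, k + 1)]

# In a real analysis, plot theo against group_means; points falling well
# off the line are candidates for "real" unit effects rather than noise.
pairs = list(zip(theo, group_means))
```

If group sizes differ, as Greta notes, the means have different variances; one common fix is to standardize each mean by its own standard error before plotting.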
I have not worked with random effects outside of multilevel models, but I will look into that. I have not heard of ANOM at all, but I will certainly look into that too.
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995