Frequent Statistical Misunderstandings

Dason

Ambassador to the humans
#1
Hola.

I know TheEcologist is working on an FAQ to put up at some point but I didn't think it would be a bad idea to compile a list of frequent misunderstandings that we see.

I've been seeing quite a few people making the mistake of assuming that in a linear model the predictor variables need to be normally distributed, or that the response itself needs to be (marginally) normally distributed. Neither is an assumption of the model; what we actually assume is that the error term is normally distributed (equivalently, that the response is normal conditional on the predictors).
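A quick sketch of what I mean, with simulated data (everything here is made up, just for illustration): the predictor is badly skewed and the marginal distribution of the response is non-normal, but the model is perfectly fine because the errors are normal.
Code:
> # simulated data: non-normal predictor, normal errors
> set.seed(42)
> x <- rexp(100)                 # skewed predictor - not a problem
> y <- 2 + 3*x + rnorm(100)      # the errors ARE normal
> fit <- lm(y ~ x)
> # check the residuals, not the raw response or the predictors:
> qqnorm(resid(fit)); qqline(resid(fit))
> shapiro.test(resid(fit))       # typically happy
> shapiro.test(y)                # typically unhappy - and that's fine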

So what other misunderstandings have you come across quite a bit (either in real life or here at Talk Stats)?
 

trinker

ggplot2orBust
#2
There seems to be a fair bit of misunderstanding about the difference between an effect size (or strength-of-effect measure) and a test of significance. I've had to clarify this several times myself.
 

CB

Super Moderator
#3
Hola back. I think misinterpretations of p values are a big one - most often, misinterpreting a p value as the probability that the null hypothesis is true, or as the probability that the effect observed is "due to chance".

Also, misinterpreting 95% confidence intervals as indicating that there's a 95% probability that the true value of the parameter lies within the given interval. I found this one in a textbook yesterday, actually :eek:
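One simulation that sometimes helps here (made-up data, just a sketch): when the null is true, p values are uniform on (0, 1), which makes it clear that a p value describes the data given the null, not the probability that the null is true.
Code:
> # when H0 is actually true, the p value is just a uniform random number
> set.seed(1)
> p <- replicate(10000, t.test(rnorm(20))$p.value)
> hist(p)          # roughly flat
> mean(p < 0.05)   # close to 0.05, by construction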
 

spunky

Can't make spagetti
#6
i've got one!!! (actually i've got a bazillion) but there is one... one that refuses to die and haunts me in the hallway on my way to my office every single day...

PEOPLE OF THE WORLD!! if you want to do an ANOVA-like analysis using OLS regression.... and you want to add an interaction term... the interaction term is NOT the product term of dummy-coded variables; you HAVE to use contrast coding!!!
 

CB

Super Moderator
#7
Also, misinterpreting 95% confidence intervals as indicating that there's a 95% probability that the true value of the parameter lies within the given interval.
I've been pondering this one a little more. I still find I have a bit of trouble explaining clearly to others why this interpretation is incorrect and what the correct interpretation is. Questions, then:

1) How do you guys like to explain the definition of a confidence interval when explaining the concept to a student or newbie?
2) In frequentist statistics, we can't say there is a 95% probability that the true parameter falls within the 95% CI, because the true parameter is a constant rather than a random variable. But what if one is philosophically entirely comfortable with a subjectivist interpretation of probability? Would it be reasonable to say "Well, I'm not using a proper Bayesian calculation right now, but I've got no problems with subjectivist probability: So I'll interpret the confidence interval as meaning there's a 95% probability the parameter falls within the interval." Would this make any sense at all?
 

Dason

Ambassador to the humans
#8
1) I tend to explain a confidence interval through the idea of repeated sampling. The confidence interval will contain the true parameter in \(100(1-\alpha)\%\) of the times the process is repeated, yadda yadda yadda. I typically accompany that with some simulations and some graphics to show what is going on. Is it the best way? Probably not, but it's how I did it.
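Something along these lines, for instance (simulated data, numbers picked arbitrarily):
Code:
> # repeated sampling: how often does the 95% CI cover the true mean?
> set.seed(123)
> true_mean <- 5
> covers <- replicate(10000, {
+     x <- rnorm(30, mean = true_mean, sd = 2)
+     ci <- t.test(x)$conf.int
+     ci[1] < true_mean && true_mean < ci[2]
+ })
> mean(covers)   # should be close to 0.95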

2) I don't think that is theoretically sound, though I can't quite put my finger on why right now. I think the problem is that you're missing the prior. In the Bayesian sense you use the likelihood to update a prior to get a posterior distribution - but the likelihood itself isn't giving you any probability; it's just telling you how to update your subjective probability. It sounds like if we do what you're suggesting we end up treating the likelihood in a way it wasn't intended to be used. Now it's a minor quibble, because if we use an uninformative prior we'll get essentially the same interval as the frequentist approach (at least for reasonable sample sizes and 'uninformative' enough priors), but if we're looking for a philosophically and theoretically sound argument I don't quite think your approach works.

Like I said, I haven't quite worked it all out - I just don't feel like it's right. But really, if you want to evaluate your intervals like that in the end... why not just go the Bayesian route to begin with?
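To illustrate the "essentially the same interval" bit, here's a toy sketch (numbers made up, and assuming a known sigma with a flat prior on the mean): the 95% credible interval and the 95% confidence interval come out of exactly the same formula, even though they mean different things.
Code:
> # normal data, known sigma, flat prior on mu: the 95% credible
> # interval and the 95% CI are numerically identical here
> set.seed(10)
> sigma <- 3
> x <- rnorm(25, mean = 10, sd = sigma)
> mean(x) + c(-1, 1) * qnorm(0.975) * sigma/sqrt(length(x))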
 
#9
well i would add that the "linear" in linear regression means that the structural part of the model, \(E(y \mid x)\) (i.e. excluding the error term), is linear in the parameters. There is no implication that \(E(y \mid x)\) is a linear function of \(x\).

I also think it would be nice to see evidence that the error term in their linear model is normally distributed.
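To make the first point concrete (simulated data, my own toy example): a model with a squared term is still an ordinary linear model because it's linear in the coefficients; and while we're at it, the residuals are the thing to look at for the normality question.
Code:
> # "linear" means linear in the parameters, not in x
> set.seed(7)
> x <- runif(50, 0, 4)
> y <- 1 + 2*x - 0.5*x^2 + rnorm(50, sd = 0.3)
> fit <- lm(y ~ x + I(x^2))    # E(y|x) is a curve; the model is still linear
> coef(fit)
> qqnorm(resid(fit)); qqline(resid(fit))   # and here's the normality check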
 

noetsi

No cake for spunky
#10
To me the most common misunderstanding is the belief that if a test is statistically significant then the effect size must be important. Or, and this is a problem especially when you have low power, the belief that an effect doesn't exist just because the test of significance isn't significant.

Another point is whether you should even use a test of statistical significance when you have the entire population (which I do in my analyses all the time). I don't think you should even run statistical tests then; just decide whether the effect size is substantively large enough to matter. But we do run the tests commonly.
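A quick simulated illustration of the first point (my numbers, picked to make the point): with a big enough sample, even a trivially small effect comes out "significant".
Code:
> # tiny true effect (d = 0.01) but a huge sample: significant yet unimportant
> set.seed(3)
> x <- rnorm(1e5, mean = 0.01)
> t.test(x)$p.value   # typically well below 0.05
> mean(x) / sd(x)     # the standardized effect size is still tiny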
 

Dason

Ambassador to the humans
#11
Another point is whether you should even use a test of statistical significance when you have the entire population (which I do in my analyses all the time). I don't think you should even run statistical tests then; just decide whether the effect size is substantively large enough to matter. But we do run the tests commonly.
The trouble here, I feel, is that the "population of interest" might not always be as well defined as we hope.
 

Jake

Cookie Scientist
#12
i've got one!!! (actually i've got a bazillion) but there is one... one that refuses to die and haunts me in the hallway on my way to my office every single day...

PEOPLE OF THE WORLD!! if you want to do an ANOVA-like analysis using OLS regression.... and you want to add an interaction term... the interaction term is NOT the product term of dummy-coded variables; you HAVE to use contrast coding!!!
Hmm... I think maybe you misspoke here? The interaction of two dummy-coded variables is demonstrably equivalent to the interaction of two contrast-coded variables. An example using a random data set sitting on my office computer (which I apparently decided to give the helpful and informative name "subjects"...) follows:
Code:
> library(car)  # recode() here is car::recode
> # contrast codes
> subjects$group_contrast <- recode(subjects$group, "'g'=-1/2; 'kg'=1/2;", as.factor.result=F)
> subjects$gender_contrast <- recode(subjects$gender, "'male'=-1/2; 'female'=1/2;", as.factor.result=F)
> summary(lm(ave0 ~ gender_contrast*group_contrast, data=subjects))

Call:
lm(formula = ave0 ~ gender_contrast * group_contrast, data = subjects)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.80283 -0.30729  0.04919  0.58290  1.54902 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      5.5623     0.1740  31.968   <2e-16 ***
gender_contrast                 -0.1209     0.3480  -0.347    0.730    
group_contrast                   0.5022     0.3480   1.443    0.157    
gender_contrast:group_contrast  -0.3567     0.6960  -0.512    0.611    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9208 on 38 degrees of freedom
Multiple R-squared: 0.06074,	Adjusted R-squared: -0.01342 
F-statistic: 0.8191 on 3 and 38 DF,  p-value: 0.4914 

> 
> # dummy codes
> subjects$group_contrast <- recode(subjects$group, "'g'=0; 'kg'=1;", as.factor.result=F)
> subjects$gender_contrast <- recode(subjects$gender, "'male'=0; 'female'=1;", as.factor.result=F)
> summary(lm(ave0 ~ gender_contrast*group_contrast, data=subjects))

Call:
lm(formula = ave0 ~ gender_contrast * group_contrast, data = subjects)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.80283 -0.30729  0.04919  0.58290  1.54902 

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
(Intercept)                     5.28241    0.46038  11.474 6.54e-14 ***
gender_contrast                 0.05746    0.51169   0.112    0.911    
group_contrast                  0.68056    0.61767   1.102    0.277    
gender_contrast:group_contrast -0.35665    0.69597  -0.512    0.611    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.9208 on 38 degrees of freedom
Multiple R-squared: 0.06074,	Adjusted R-squared: -0.01342 
F-statistic: 0.8191 on 3 and 38 DF,  p-value: 0.4914
The results here are clearly the same (in general the two codings are equivalent except for arbitrary changes in the sign and magnitude of the coefficients, which of course are perfectly balanced by commensurate changes in the variance of the predictors). I think maybe what you meant to say is that in the presence of an interaction term, the dummy-coded variables are manifestly not the same as the "main effects" in an ANOVA. Of course, this is also true when using contrast codes, but in a far less obvious and arguably less important way.
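To spell that out (with hypothetical balanced two-way data, nothing to do with the "subjects" set above): with dummy codes the lower-order coefficient is a simple effect - the effect of A at the reference level of B - while with centered contrast codes it's the ANOVA-style main effect, i.e. the effect of A averaged over the levels of B.
Code:
> # hypothetical 2x2 data with a real interaction
> set.seed(5)
> d <- expand.grid(A = c(0, 1), B = c(0, 1))[rep(1:4, each = 25), ]
> d$y <- 1 + 0.5*d$A + 0.2*d$B + 1.5*d$A*d$B + rnorm(100)
> coef(lm(y ~ A*B, data = d))["A"]        # ~0.5: effect of A when B = 0
> d$Ac <- d$A - 0.5; d$Bc <- d$B - 0.5    # center to get contrast codes
> coef(lm(y ~ Ac*Bc, data = d))["Ac"]     # ~1.25: effect of A averaged over B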
 

spunky

Can't make spagetti
#13
I think maybe what you meant to say is that in the presence of an interaction term, the dummy-coded variables are manifestly not the same as the "main effects" in an ANOVA
that's exactly the point i was trying to make, but i guess i never really gave a full development of the idea as you did with R code and everything, thank you for including it. i just have issues every time i see students (or just people who need stats advice in general) coming to my office and saying stuff like "if regression and ANOVA are kind of like the same then why am i not getting the same results", in which case i have to go over the details of coding and what it means to do it one way or another... multiply the number of people with the same question by the number of years i've been TAing and at some point you're like "you know what... just... just stick to this". i think that's the same way Kaiser's eigenvalues-greater-than-1 rule was invented :p
 

Jake

Cookie Scientist
#14
Heh... speaking of which, just two days ago I heard someone lament that a factor analysis on some of their scale data had been decidedly uninformative because "like 10 or 15 of the eigenvalues were greater than 1!" I didn't want to derail the meeting (or his self-esteem!) by pointing out that that was a pretty silly thing to say, so I had to bite my tongue...
 

spunky

Can't make spagetti
#15
Heh... speaking of which, just two days ago I heard someone lament that a factor analysis on some of their scale data had been decidedly uninformative because "like 10 or 15 of the eigenvalues were greater than 1!" I didn't want to derail the meeting (or his self-esteem!) by pointing out that that was a pretty silly thing to say, so I had to bite my tongue...
i have an article somewhere called something like "Kaiser's little jiffy" or "on the birth of Kaiser's little jiffy"... do you want me to send it to you? it's a kind of humorous account of how this thing began as a joke between kaiser & his friends and is now the most overused (and abused) technique for deciding on the number of factors... you could show it to your friend ;)
 

Jake

Cookie Scientist
#16
Sounds like a fun read. If you have a full citation I'm sure I can get my hands on it, otherwise a PM attachment would be great.
 

spunky

Can't make spagetti
#17
ok... so i wanted to upload it for everyone to see but, apparently, the pdf is too big for the forum... :( anyways, the citation is:

Kaiser, H.F. (1970). A second generation little jiffy. Psychometrika, 35(4), 401-415.

(i know my APA style is off but there's all the info you need right there i'm sure)
 

noetsi

No cake for spunky
#18
The trouble here, I feel, is that the "population of interest" might not always be as well defined as we hope.
In many cases, like say demographic or economic data, I would agree. But I work a lot with the people we serve (which is our entire population), and legally and practically there is no question that we know exactly who that group is and that all the pertinent data I run covers them. If not, someone is in a lot of trouble :)
 

Dason

Ambassador to the humans
#19
I still say it can be more complicated than you're making it out to be, but it depends on what you're trying to do.

For example, suppose I'm interested in some information about everybody in the city of Ames. Even if I explicitly state which people I'm talking about (the population changes over time, so what I mean by "everybody in the city of Ames" can be pretty fuzzy), that doesn't mean the thing I'm interested in is always fixed. Sometimes it is, but sometimes it isn't. Especially if we're talking about opinions: I could get everybody's opinion at a certain time, but would I really have a complete representation of my population? Opinions can change! Even though the subjects in the population are the same... the population is changing.

Now somebody might claim that since I've collected information about every subject in the population I don't need to do statistics - but that's only true if I only care about the exact moment that I collected the data and the exact group of people that were in my study.

This is true in some cases, but I just wanted to make the point that even if you get information about all the subjects in your 'population' you might still want to do some sort of inferential statistics. Thinking this way adds a lot of complications to what you're trying to do, but hopefully you can see why somebody might not just stop at descriptive statistics when they've collected information from an entire "population".
 

spunky

Can't make spagetti
#20
Now somebody might claim that since I've collected information about every subject in the population I don't need to do statistics - but that's only true if I only care about the exact moment that I collected the data and the exact group of people that were in my study.
or we can also add to what you mentioned the assumption that whatever i'm measuring is measured without some sort of systematic error (or perhaps error in general, but i'd need to think more about that one) that may bias our estimates one way or another... just as you said, inference is a good thing to keep in mind...