# Normality: DV itself or residuals?

#### ooostats

##### Member
I honestly don’t think I trust any psych/social science paper from before 2011 when the Replication Crisis started making its rumbles.
I guess I tend towards this opinion as well. Although I also think that this gets blown way out of proportion: Some sub-fields are much worse than others, and in different ways (e.g. theoretical vs statistical vs experimental design & control), and journal companies are def not helping given their preference for p < .05!

Definitely feels like many people/researchers pretend to care, but then see the equations that you posted itt and decide it's a problem for someone smarter to solve

#### Karabiner

##### TS Contributor
So… I don’t know about the “one wouldn’t care anyway” part. *UNLESS* we are really *only* talking about the sample mean and nothing beyond that.
This is a good idea, to keep in mind that the distribution and standard error issues might have different
impact, depending on e.g. which parameters are concerned.
So, to be honest, what I do these days is to advocate for training in simulation methods OR pairing up with people who can do simulations for you. Especially when it comes to power analysis and stuff like that. That would be my rule of thumb: check by simulation first.
Generally, this is a very good point. Starting from simulated populations and then taking a large
number of samples, simulation gives me an idea what can go wrong under defined circumstances,
or what presents itself as robust. But I am not sure how to apply it in order to check from which
distribution a given sample was drawn.
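
The "check by simulation first" idea can be sketched in a few lines of R. This is only a minimal illustration, and everything in it is an assumption made for the example: a right-skewed population built from a shifted gamma(1, 1), a one-sample t-test, the sample sizes, and the hypothetical helper name `reject_rate`.

```r
set.seed(2024)  # arbitrary seed (assumption), only for reproducibility

# Share of simulated samples in which a one-sample t-test of H0: mean = 0
# rejects at level alpha. The population is a shifted gamma(1, 1), i.e. a
# right-skewed distribution whose true mean equals `true_mean`.
reject_rate <- function(n, true_mean, n_sims = 5000, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    x <- rgamma(n, shape = 1, rate = 1) - 1 + true_mean
    t.test(x, mu = 0)$p.value
  })
  mean(p_values < alpha)
}

# H0 is true here, so these are estimated type I error rates
type1_small <- reject_rate(n = 10, true_mean = 0)
type1_large <- reject_rate(n = 200, true_mean = 0)

# H0 is false here, so this is estimated power
power_small <- reject_rate(n = 10, true_mean = 0.5)

c(type1_n10 = type1_small, type1_n200 = type1_large, power_n10 = power_small)
```

Comparing the estimated type I error rates with the nominal 0.05 shows how much the skewness matters at each sample size, and swapping in another population or test only means changing the two lines inside `replicate()`.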

With kind regards

Karabiner

#### spunky

##### Doesn't actually exist
I guess I tend towards this opinion as well. Although I also think that this gets blown way out of proportion: Some sub-fields are much worse than others, and in different ways (e.g. theoretical vs statistical vs experimental design & control), and journal companies are def not helping given their preference for p < .05!

Definitely feels like many people/researchers pretend to care, but then see the equations that you posted itt and decide it's a problem for someone smarter to solve
Which is why I always advocate for collaborative work. For example, I am not a social psychologist. I wouldn't even dream of trying to set up a study on my own and running it to get a publication. If I'm interested in something, I'd perhaps knock on a few people's doors and see if I can get involved as the "data analysis guy". Nevertheless, for some reason, that is not a two-way street, and I can see a lot of people trying to do what I do and doing it badly. This has a doubly pernicious effect: on the one hand, things get done in pretty sub-optimal ways. On the other, perfectly capable people with PhDs find themselves un/underemployed because almost everybody feels sufficiently capable of running their analyses on their own. Everybody loses!

I do have a little glimmer of hope that since the Replication Crisis is just gaining more and more momentum (albeit INCREDIBLY slowly), more and more researchers are realizing that people with expertise solely in quantitative methods are valuable.

#### spunky

##### Doesn't actually exist
This is a good idea, to keep in mind that the distribution and standard error issues might have different
impact, depending on e.g. which parameters are concerned.
This is very good advice. I remembered our (online) conversation yesterday when I was sitting in on a 4th-year undergrad course in econometrics. They were talking about instrumental variable (IV) estimators and deriving one of the 'simplest' forms of the standard error for this type of regression coefficient. As it turns out (and this blew my mind a little bit), the standard error for IV estimators has a sample size term 'n' in both the numerator AND the denominator. So, as sample size goes to infinity, you can have something that is asymptotically normal **BUT** not consistent. The standard error never goes to 0 just by letting the sample size increase arbitrarily. The only way it goes to 0 is if you have both a large sample size AND what is called a 'strong' or 'valid' instrument. And those are hard to come by because they cannot be selected on the basis of statistical considerations alone.

So yeah... once you step outside the very basics, there is a lot of weirdo stuff that goes on.
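
A rough simulation sketch of the weak-instrument point, not the derivation from that course. The data-generating process below is entirely hypothetical: an unobserved confounder `u` makes `x` endogenous, the estimator is the simple ratio cov(z, y) / cov(z, x), and an instrument with first-stage coefficient 0 stands in for the extreme "weak" case. Spread is measured with the interquartile range because the estimator can have very heavy tails.

```r
set.seed(42)  # arbitrary seed (assumption), only for reproducibility

# Simulate the simple IV estimator cov(z, y) / cov(z, x) many times.
# `first_stage` is the coefficient of the instrument z in the equation for x:
# 0 means z is irrelevant (the extreme "weak instrument" case).
iv_estimates <- function(n, first_stage, n_sims = 500, beta = 1) {
  replicate(n_sims, {
    z <- rnorm(n)                          # instrument
    u <- rnorm(n)                          # unobserved confounder
    x <- first_stage * z + u + rnorm(n)    # endogenous regressor
    y <- beta * x + u + rnorm(n)           # outcome
    cov(z, y) / cov(z, x)
  })
}

sizes <- c(100, 1000, 10000)

# Interquartile range of the estimates at each sample size
spread <- sapply(sizes, function(n) {
  c(strong = IQR(iv_estimates(n, first_stage = 1)),
    weak   = IQR(iv_estimates(n, first_stage = 0)))
})
colnames(spread) <- paste0("n=", sizes)
round(spread, 3)
```

In this setup the spread for the strong instrument should shrink roughly like 1/sqrt(n), while the spread for the irrelevant instrument should stay about the same size however large n gets.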

Generally, this is a very good point. Starting from simulated populations and then taking a large
number of samples, simulation gives me an idea what can go wrong under defined circumstances,
or what presents itself as robust. But I am not sure how to apply it in order to check from which
distribution a given sample was drawn.

With kind regards

Karabiner
Well, I mean... it is not perfect, but I would say a good first step would always be to fit a distribution to the data. For example, let's play God for a moment and pretend that our dataset somehow came from a gamma distribution with shape and rate parameters both equal to 1. So $$X \sim G(1,1)$$ and the sample size is $$n=1000$$.

After some exploration and plotting, one could do something like this:

Code:
    library(fitdistrplus)

    g <- rgamma(n = 1000, shape = 1, rate = 1)  # hypothetical dataset we collected

    summary(fitdist(g, "norm"))
    #> Fitting of the distribution ' norm ' by maximum likelihood
    #> Parameters :
    #>      estimate Std. Error
    #> mean 1.012525 0.03181838
    #> sd   1.006186 0.02249889
    #> Loglikelihood:  -1425.105   AIC:  2854.21   BIC:  2864.026
    #> Correlation matrix:
    #>      mean sd
    #> mean    1  0
    #> sd      0  1

    summary(fitdist(g, "gamma"))
    #> Fitting of the distribution ' gamma ' by maximum likelihood
    #> Parameters :
    #>       estimate Std. Error
    #> shape 1.016245 0.04007357
    #> rate  1.003637 0.05057144
    #> Loglikelihood:  -1012.363   AIC:  2028.726   BIC:  2038.542
    #> Correlation matrix:
    #>           shape      rate
    #> shape 1.0000000 0.7825828
    #> rate  0.7825828 1.0000000

The first part tries to fit a normal distribution and estimates (via MLE) what the most likely parameters for this dataset would be. The second part fits a gamma distribution and also estimates the parameters. You can see by looking at the information criteria (lower is better) that the gamma distribution provides a MUCH better fit than the normal. And it estimates the most likely parameters for both distributions (and does a pretty good job at it because my sample size is large).
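
For anyone wondering where the AIC and BIC numbers come from, they can be recomputed by hand from the log-likelihoods printed in the output. A small sketch, using the standard definitions AIC = 2k - 2·logLik and BIC = k·log(n) - 2·logLik with k = 2 estimated parameters per model (the variable names are just for illustration):

```r
# Values copied from the fitdist() summaries in the post
n <- 1000   # sample size
k <- 2      # each model estimates two parameters (mean/sd or shape/rate)
loglik <- c(normal = -1425.105, gamma = -1012.363)

aic <- 2 * k - 2 * loglik        # Akaike information criterion
bic <- k * log(n) - 2 * loglik   # Bayesian information criterion

round(aic, 3)   # normal 2854.210, gamma 2028.726
round(bic, 3)   # normal 2864.026, gamma 2038.542

# Difference in AIC between the two fits (normal minus gamma)
unname(aic["normal"] - aic["gamma"])   # about 825.5
```

Since both models have the same number of parameters, the whole difference comes from the log-likelihoods.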

I'd imagine doing something like that at the first stages of analysis. And the fitdistrplus package provides a wide array of plotting techniques and methods to try and 'guesstimate' which distribution most likely generated your data.
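
A short sketch of what those plotting and comparison tools could look like in practice. It assumes the fitdistrplus package is installed and uses an arbitrary seed, so the numbers will not match the output posted above. `descdist()`, `denscomp()`, `qqcomp()` and `gofstat()` are functions from that package.

```r
library(fitdistrplus)

set.seed(1)  # arbitrary seed (assumption), only for reproducibility
g <- rgamma(n = 1000, shape = 1, rate = 1)  # same hypothetical setup as above

# Skewness-kurtosis (Cullen and Frey) plot: a first hint at candidate families
descdist(g, boot = 100)

fit_norm  <- fitdist(g, "norm")
fit_gamma <- fitdist(g, "gamma")
fits <- list(fit_norm, fit_gamma)
labels <- c("normal", "gamma")

# Histogram with both fitted densities overlaid
denscomp(fits, legendtext = labels)

# Q-Q plot of empirical against fitted quantiles for both candidates
qqcomp(fits, legendtext = labels)

# Goodness-of-fit statistics and information criteria side by side
gofstat(fits, fitnames = labels)
```

The plots are usually more informative than any single statistic, since they show where a candidate distribution fails (for example, in the tails).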

I dunno why we don't teach stuff like this in my turf in social-science-land. I think a lot of better data practice could come from trying to let the data speak for itself, as opposed to assuming the normal distribution everywhere and then adding patches and corrections to our analyses so they fit the normal model.

#### noetsi

##### Fortran must die
"Dr. Leona Aiken from Arizona State University has done a lot of interesting research on this in psychology, showing some pretty convincing (and damning) evidence that, at least in the United States, people with PhDs in Quantitative or Mathematical Psychology are undervalued."

So we are talking about what, 500 people in the US?

I don't know what undervalued means. My PhD professor said that the most important thing in getting hired for a PhD-level position was whether the people who hired you, your fellow professors, enjoyed their interaction with you, because they were going to have to live with you for a lifetime given tenure. As someone who spent the last decade trying to get better at stats (and spent enough time at universities to earn 4 graduate degrees), my question would be: do these individuals generate enough useful findings that significantly contribute to anything practical? If they don't, how are they undervalued?

My guess, from my life after college, is that relatively few organizations are interested enough in formal methods for even geniuses in that area to make a big difference. And there are so many potential errors in methods (many outside the method itself, such as measurement error) that one might question whether the methods actually produce good results. Or at least, are we sure they do? How many times do you run into an article that says the prevailing method of doing something is wrong? I have run into this so often that I begin to doubt the value of running statistics (admittedly, if I was better at it I might have fewer doubts).

I once read an article about the elite medical journals that pointed out just how badly logistic regression was being presented (such as confusing odds ratios, I think, with relative risk, which is rarely the same thing). And these were the best medical journals...


#### ooostats

##### Member
There are lots of interesting points raised in this thread that I'll no doubt be going back to. One thing that I'm a little confused about though is what you said (@spunky):

Well… a lot of Machine Learning stuff is more interested in prediction rather than inference. And parametric assumptions are most important for matters of inference.
Why are parametric assumptions only really important for inference and not modeling in general? For example if I have two variables that are non-linearly related and try to fit a linear regression model (with no interest in significance), surely my violation of that assumption means that the model fitted is not appropriate and it would probably suck.
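
The scenario in the question can be made concrete with a small simulated example. Everything here is assumed for illustration (a quadratic relationship, the sample size, the noise level), and the errors are drawn from a normal distribution on purpose, so any problem with the straight-line fit comes from the functional form alone.

```r
set.seed(11)  # arbitrary seed (assumption), only for reproducibility

n <- 200
x <- runif(n, min = -3, max = 3)
y <- x^2 + rnorm(n, sd = 0.5)   # curved relationship; errors are exactly normal

fit_linear    <- lm(y ~ x)            # straight line: wrong functional form
fit_quadratic <- lm(y ~ x + I(x^2))   # matches how the data were generated

# Proportion of variance explained by each model
r2 <- c(linear    = summary(fit_linear)$r.squared,
        quadratic = summary(fit_quadratic)$r.squared)
round(r2, 3)

# A residual plot makes the leftover curvature visible
plot(fitted(fit_linear), resid(fit_linear),
     xlab = "Fitted values", ylab = "Residuals")
```

The straight line explains almost none of the variance here even though the errors are perfectly normal, which suggests the poor fit is a question of how the model is specified and not of the error distribution.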

#### spunky

##### Doesn't actually exist
There are lots of interesting points raised in this thread that I'll no doubt be going back to. One thing that I'm a little confused about though is what you said (@spunky):

Why are parametric assumptions only really important for inference and not modeling in general? For example if I have two variables that are non-linearly related and try to fit a linear regression model (with no interest in significance), surely my violation of that assumption means that the model fitted is not appropriate and it would probably suck.

But 'linearity' is not a parametric assumption. The issue that you bring forward (linearity) is an issue of model misspecification. Regression coefficients preserve their small-sample properties (unbiasedness, and minimum variance among linear unbiased estimators) even in the absence of normally distributed errors because of the Gauss-Markov theorem.
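
The unbiasedness part of this can be checked by simulation. A minimal sketch under assumed conditions: a correctly specified straight-line model, a small fixed design of 15 points, and strongly skewed errors (an exponential shifted to have mean zero). The true intercept and slope are arbitrary choices for the example.

```r
set.seed(99)  # arbitrary seed (assumption), only for reproducibility

x <- 1:15          # small, fixed design
true_slope <- 3    # hypothetical true value

# Refit the regression on many datasets with skewed, mean-zero errors
slopes <- replicate(2000, {
  e <- rexp(length(x), rate = 1) - 1   # right-skewed errors with mean 0
  y <- 2 + true_slope * x + e
  coef(lm(y ~ x))[["x"]]
})

mean(slopes)   # average estimate across simulations; compare with true_slope
hist(slopes, main = "Sampling distribution of the OLS slope")
```

If the average of the estimated slopes sits close to the true value, that is the unbiasedness at work: it does not rely on the errors being normal, only on the model being correctly specified with mean-zero errors.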

#### noetsi

##### Fortran must die
Well, the parameter estimates may be unbiased, but I don't think the test statistics (and their p-values) will be correct if the data are non-normal and you have a small sample size.
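
Whether the tests hold up is something that can be checked for a specific situation by simulation. The sketch below is one assumed setup, not a general verdict: the true slope is zero, the errors are a shifted exponential, the design is equally spaced, and `slope_reject_rate` is a hypothetical helper name. If the slope's t-test behaves as advertised, it should reject in about 5% of the simulated datasets.

```r
set.seed(7)  # arbitrary seed (assumption), only for reproducibility

# Share of simulated datasets in which the t-test for the slope rejects
# H0: slope = 0 at level alpha, when the true slope really is 0 and the
# errors are right-skewed (shifted exponential with mean 0).
slope_reject_rate <- function(n, n_sims = 1000, alpha = 0.05) {
  x <- seq_len(n)
  p_values <- replicate(n_sims, {
    y <- 2 + rexp(n, rate = 1) - 1
    summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  })
  mean(p_values < alpha)
}

# Estimated type I error rate at a few sample sizes
rates <- sapply(c(n5 = 5, n15 = 15, n100 = 100), slope_reject_rate)
rates
```

How far the rates stray from 0.05 will depend on the error distribution, the design, and the sample size, so it is worth rerunning this with errors and designs that resemble the data at hand.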