- Thread starter yoma819

The only problem is I don't seem to be having much luck.

To try and normalise the gamma-distributed data I have tried:

square

square root

log

but to no avail.

This is the first time I have done a data transformation and I would appreciate any help.

cheers

Yoma

I have run the tests.

Basically, I ran it through Minitab, which told me that my 4 sets of data were not normally distributed. Then I ran it through EasyFit, which told me which distributions fitted my data.

have a look:

http://s884.photobucket.com/albums/ac50/yoma819/

cheers

yoma

So what are you trying to do with this data? Why do you feel it needs to be normally distributed?

Basically, what I was trying to find out how to do in this post:

Code:

`http://talkstats.com/showthread.php?t=13096`

I need the data to be normally distributed, as this is an assumption of a GLM!

cheers

Yoma

Why would I only be interested in the normality of my residuals?

I guess one way to see why this is what we care about is to consider a case where we are comparing two groups.

Code:

```
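# Data from the first group: 100 draws from a normal with mean 5 (sd defaults to 1)
y1 <- rnorm(100,5)
# Data from the second group: 100 draws from a normal with mean 100
y2 <- rnorm(100,100)
# Overall data: both groups combined into one vector
y <- c(y1,y2)
hist(y)  # clearly not normal (two well-separated peaks)
hist(y1) # within each group, though, the data look normal
hist(y2)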
```

The only reason we even care is because one of the assumptions we make when deriving the theory is that the errors are normally distributed. We don't care how the data itself is distributed because we say that once we adjust for our predictors the errors/residuals will be normally distributed.

Clearly the overall data isn't normally distributed... but who cares. Once we look at each group individually they look normal so we're all right. This is why we only care if the residuals are normally distributed.
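
To make the "adjust for our predictors" step concrete, here is a minimal sketch that fits a linear model to the same kind of two-group data and tests the residuals rather than the raw response. The `group` labels, the seed, and the use of `shapiro.test` are assumptions added for illustration; they are not part of the original example.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same two groups as in the example above
y1 <- rnorm(100, 5)
y2 <- rnorm(100, 100)
y <- c(y1, y2)

# Hypothetical group labels: first 100 values are "g1", next 100 are "g2"
group <- factor(rep(c("g1", "g2"), each = 100))

# Linear model with group as the only predictor
fit <- lm(y ~ group)

# Residuals are each observation minus its own group mean
res <- residuals(fit)

# Shapiro-Wilk normality test on the raw response and on the residuals
p_raw <- shapiro.test(y)$p.value
p_res <- shapiro.test(res)$p.value
print(c(raw = p_raw, residuals = p_res))
```

The raw response should be rejected as normal very strongly, because it has two far-apart peaks, while the residuals are what the normality assumption actually refers to. For a visual check, `hist(res)` or `qqnorm(res)` could be used in the same way as the histograms above.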

I take it that in your R code you are generating random data and putting it into y1?

Code:

`y1 <- rnorm(100,5)`

Code:

`y2 <- rnorm(100,100)`

Code:

`y <- c(y1,y2)`

Sorry, just trying to understand your R code!

cheers

Yoma
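
For anyone reading along, here is a rough annotated version of those three lines, written with the argument names spelled out. The seed and the checks at the end are additions for illustration only.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# rnorm(n, mean = 0, sd = 1) returns n random draws from a normal distribution
y1 <- rnorm(100, mean = 5)    # 100 draws with mean 5; sd defaults to 1
y2 <- rnorm(100, mean = 100)  # 100 draws with mean 100; sd defaults to 1

# c() concatenates the two vectors into a single vector
y <- c(y1, y2)

length(y)  # 200 values in total
mean(y1)   # should be close to 5
mean(y2)   # should be close to 100
```

So yes: `y1` and `y2` hold simulated data for the two groups, and `y` is simply both groups stacked together.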