# Nested/Hierarchical ANOVA

#### patnvkelly

##### New Member
Hi,

I am trying to determine which is the correct test to use for my research.

Briefly, I am carrying out forensic research in which I measure the quantity of DNA at different locations in people's vehicles. I selected the same 5 locations around the driver area of 20 different vehicles (steering wheel, door handle, indicator/wiper, handbrake/gear stick and central console). I carried out a 1-way ANOVA to compare the different groups of areas with each other and found no statistical difference between the areas. It has been brought to my attention that this may not be the correct test, and that because my locations are 'nested' within the vehicles I may need a different test. Do I also need to compare DNA quantities as a whole between vehicles? It has been suggested that a nested ANOVA or a MANOVA might be the correct test.

Any help or advice on this would be very much appreciated
Thank you
Pat

#### Miner

##### TS Contributor
If your sole interest is in the differences between locations within a vehicle, and not between vehicles, you can subtract each vehicle's mean value from the individual values for that vehicle. For example, Steering wheel (vehicle1) - Mean (vehicle1); door handle (vehicle1) - Mean (vehicle1), etc. This will remove the vehicle-to-vehicle variation and focus on the location differences. You can then use the 1-way ANOVA. This is similar to how a paired t-test uses the differences between pairs to remove subject-to-subject variation.
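A minimal sketch of the centering step described above, using made-up numbers. The names `dna`, `LOCATIONS` and `center_by_vehicle` are hypothetical, and the values are illustrative only, not data from the study.

```python
from statistics import mean

# Hypothetical DNA quantities (arbitrary units): one row per vehicle,
# one value per location, in the order given by LOCATIONS.
LOCATIONS = ["steering wheel", "door handle", "indicator/wiper",
             "handbrake/gear stick", "central console"]
dna = {
    "vehicle1": [4.0, 2.0, 1.0, 3.0, 5.0],
    "vehicle2": [14.0, 11.0, 10.0, 12.0, 13.0],
    "vehicle3": [0.5, 0.2, 0.1, 0.4, 0.3],
}


def center_by_vehicle(data):
    """Subtract each vehicle's own mean from each of its location values."""
    centered = {}
    for vehicle, values in data.items():
        vehicle_mean = mean(values)
        centered[vehicle] = [v - vehicle_mean for v in values]
    return centered


centered = center_by_vehicle(dna)
for vehicle, values in centered.items():
    print(vehicle, [round(v, 2) for v in values])
```

Each vehicle's centered values sum to zero, while the gap between any two locations within the same vehicle is unchanged.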

#### patnvkelly

##### New Member
> If your sole interest is in the differences between locations within a vehicle, and not between vehicles, you can subtract each vehicle's mean value from the individual values for that vehicle. For example, Steering wheel (vehicle1) - Mean (vehicle1); door handle (vehicle1) - Mean (vehicle1), etc. This will remove the vehicle-to-vehicle variation and focus on the location differences. You can then use the 1-way ANOVA. This is similar to how a paired t-test uses the differences between pairs to remove subject-to-subject variation.

Hi, thank you for your response and solution. Actually, I am just really interested in the differences between the vehicles. If I subtract the mean as you suggest, a lot of my values will end up being negative. Is this a problem? Apologies for my statistical ignorance!
Thank you

#### Miner

##### TS Contributor
I'm a little confused. In your first post, you analyzed the differences in location within a vehicle. Now you are saying that you are interested in the differences between vehicles. What is your study hypothesis?

#### patnvkelly

##### New Member
Hi,

Sorry for the confusion I have probably not been very clear!

I grouped all the steering wheels together (20 in total from all the different vehicles), all the door handles together (20 in total), all the central consoles together (20 in total) and so on, and then compared these groups using a 1-way ANOVA. I didn't take into consideration which vehicle they came from. I wanted to see whether any of the locations yielded more DNA than the others, or whether there was no statistical difference.
Does this make sense?
thanks

#### Miner

##### TS Contributor
> I wanted to see if any of the locations yielded more DNA than the others or even there was no statistical difference.

So your alternative hypothesis (HA) is that there is a difference between locations? Vehicles are simply replicates?

#### patnvkelly

##### New Member
Yes that is correct, it was not the main point of my study but something extra that I thought would be of interest.

#### Miner

##### TS Contributor
Okay, I will refer back to my first response then. If you subtract the vehicle mean from the location results for that specific vehicle, you will remove the vehicle-to-vehicle variation from the analysis. Yes, that will give you a lot of negative values, but that is not a problem because the differences between locations have not changed. Repeat the 1-way ANOVA on the new set of data.
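To see what the centering does to the 1-way ANOVA, here is a rough sketch on made-up numbers (the `rows` table and function names are hypothetical). It computes the sums of squares by hand before and after centering. One caveat worth keeping in mind: after centering, an ordinary 1-way ANOVA still counts N - a error degrees of freedom, whereas a randomized block analysis would use (a - 1)(b - 1) for a locations and b vehicles, so the F from centered data is likely somewhat optimistic.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS_between, SS_within, F) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


def by_location(table):
    """Turn a vehicles-by-locations table into one group per location."""
    return [list(col) for col in zip(*table)]


# Hypothetical data: rows are vehicles, columns are locations.
rows = [
    [4.0, 2.0, 1.0],
    [14.0, 11.0, 10.0],
    [25.0, 22.0, 21.0],
    [33.0, 32.0, 30.0],
]

# Subtract each vehicle's mean from that vehicle's values.
centered_rows = [[x - mean(row) for x in row] for row in rows]

raw = one_way_anova(by_location(rows))
cen = one_way_anova(by_location(centered_rows))

print("raw:      SS_between=%.3f SS_within=%.3f F=%.3f" % raw)
print("centered: SS_between=%.3f SS_within=%.3f F=%.3f" % cen)
```

In this toy table the between-location sum of squares is the same in both runs, and only the within-group (error) term shrinks once the vehicle-to-vehicle variation is removed.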

A nested ANOVA will not work. It will tell you if there are differences between vehicles, but without replicates at each vehicle location, it will use the location variation as the estimate of experimental error.

MANOVA will tell you whether differences exist between vehicles for one or more of the locations, but not whether there are differences between locations.

#### patnvkelly

##### New Member
So just to be sure I have this right: for my steering wheel in vehicle 1, for example, I should subtract the mean of all the locations in vehicle 1 from the steering wheel value, and then do the same for all my other locations?

When I first did the ANOVA I had to transform my data (lg10) because it was not evenly distributed, will having a lot of negative values cause any issues with this?

Just out of interest, is there any test I could use to examine the variation between locations and also the variation between vehicles?

I appreciate all your help with this

#### Miner

##### TS Contributor
> So just to be sure I have this right, for example my steering wheel in vehicle 1, I should subtract the mean of all the locations in vehicle 1 from the steering wheel? and follow on and do this for all my locations?

Correct.

> When I first did the ANOVA I had to transform my data (lg10) because it was not evenly distributed, will having a lot of negative values cause any issues with this?

I would run the 1-way ANOVA first without transforming the data and look at the residuals plot. If the residuals vs. fitted values plot does not show an unusual dispersion pattern suggesting heteroskedasticity (see plot), you do not need to transform the data. Transforms always result in a loss of information, so I prefer not to use them.
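As a rough illustration of what a residuals vs. fitted check looks at, the sketch below computes the (fitted, residual) pairs for a 1-way ANOVA, where the fitted value for each observation is simply its group mean. The three groups are invented so that the spread grows with the mean, which is the funnel pattern that would suggest heteroskedasticity. Plotting is left out; the pairs could be passed to any plotting tool.

```python
from statistics import mean, stdev

# Hypothetical groups, chosen so that the spread grows with the group mean.
groups = {
    "location A": [1.0, 2.0, 3.0],
    "location B": [8.0, 10.0, 12.0],
    "location C": [15.0, 20.0, 25.0],
}


def fitted_and_residuals(groups):
    """Return (fitted, residual) pairs; the fitted value is the group mean."""
    pairs = []
    for values in groups.values():
        fitted = mean(values)
        pairs.extend((fitted, v - fitted) for v in values)
    return pairs


pairs = fitted_and_residuals(groups)

# Residual spread at each fitted value; a steady increase hints at
# heteroskedasticity, a roughly constant spread does not.
spread_by_fitted = {}
for fitted, resid in pairs:
    spread_by_fitted.setdefault(fitted, []).append(resid)
spreads = [(fitted, stdev(res)) for fitted, res in sorted(spread_by_fitted.items())]

for fitted, sd in spreads:
    print(f"fitted = {fitted:5.1f}  residual SD = {sd:.2f}")
```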

If you still need to log10 transform the data, you will need positive values, so just add the same constant (e.g., 10, 20, 100, etc.) to all of the values. This constant just needs to be large enough to make the smallest number strictly positive (greater than zero). This will not affect the results of the ANOVA as far as the p-values when testing for a significant effect. However, if you use a regression approach, it will affect the constant term.
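A small check of the add-a-constant idea, on invented values (the `groups` list and helper names are hypothetical). Shifting untransformed data by a constant leaves the 1-way F statistic unchanged. Once log10 is applied after the shift, though, the F statistic generally depends on which constant was chosen, so it seems prudent to report the constant and, if in doubt, try more than one.

```python
import math
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def shift(groups, constant):
    """Add the same constant to every value."""
    return [[x + constant for x in g] for g in groups]


def log10_groups(groups):
    """log10 of every value; math.log10 raises ValueError for values <= 0."""
    return [[math.log10(x) for x in g] for g in groups]


# Hypothetical values, some of them negative (as after centering).
groups = [
    [1.2, -0.4, 0.9, 2.1],
    [-1.5, 0.3, -0.8, -0.2],
    [0.1, -2.0, 0.6, -0.3],
]

f_raw = f_statistic(groups)
f_shifted = f_statistic(shift(groups, 10))
f_log_10 = f_statistic(log10_groups(shift(groups, 10)))
f_log_100 = f_statistic(log10_groups(shift(groups, 100)))

print(f"F, raw values:          {f_raw:.4f}")
print(f"F, values + 10:         {f_shifted:.4f}")
print(f"F, log10(values + 10):  {f_log_10:.4f}")
print(f"F, log10(values + 100): {f_log_100:.4f}")
```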

> Is there any test I could use to examine the variation between locations and also the variation between vehicles, just out of interest.

Only if you have replicate measurements for each location of each vehicle. The usual test would be the nested ANOVA, but it will not work as you need without the replicates. Maybe another contributor can suggest an alternative that I haven't thought of.
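One commonly used alternative for this kind of layout is a two-way ANOVA without replication (a randomized complete block design), with vehicles as blocks and locations as treatments. The sketch below is only an illustration under two stated assumptions: exactly one measurement per vehicle-location cell, and no vehicle-by-location interaction, since without replicates the interaction cannot be separated from error. The table and function name are hypothetical. P-values are not computed here; they would come from the F distribution with the degrees of freedom shown.

```python
from statistics import mean


def randomized_block_anova(table):
    """table: rows are vehicles (blocks), columns are locations (treatments)."""
    b = len(table)       # number of vehicles
    a = len(table[0])    # number of locations
    grand = mean(x for row in table for x in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(col) for col in zip(*table)]

    ss_location = b * sum((m - grand) ** 2 for m in col_means)
    ss_vehicle = a * sum((m - grand) ** 2 for m in row_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_location - ss_vehicle

    df_location, df_vehicle = a - 1, b - 1
    df_error = df_location * df_vehicle
    ms_error = ss_error / df_error
    return {
        "df": (df_location, df_vehicle, df_error),
        "ss_error": ss_error,
        "F_location": (ss_location / df_location) / ms_error,
        "F_vehicle": (ss_vehicle / df_vehicle) / ms_error,
    }


# Hypothetical data: 4 vehicles (rows) by 3 locations (columns).
table = [
    [4.0, 2.0, 1.0],
    [14.0, 11.0, 10.0],
    [25.0, 22.0, 21.0],
    [33.0, 32.0, 30.0],
]

result = randomized_block_anova(table)
print(result)
```

With 5 locations and 20 vehicles, the same layout would give 4, 19 and 76 degrees of freedom for locations, vehicles and error.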

You are welcome.

#### patnvkelly

##### New Member
Thank you again for all your help, it is very much appreciated. Hopefully I will not need to come back for more!!

#### patnvkelly

##### New Member
Hi
I have a few more questions if that is ok?

Here is my residual plot for my data. When I run tests for normality (Kolmogorov-Smirnov and Shapiro-Wilk), 4 of my groups have p < 0.05, so from that my data are not normally distributed. Even transforming the data does not help. Is it possible to do an ANOVA when the data are not normally distributed? I have read that if my groups are large enough it is still OK to do an ANOVA. My groups have 20 in each.

If ANOVA is not possible, should I then carry out a non-parametric test, such as the Kruskal-Wallis test?
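For reference, a minimal sketch of the Kruskal-Wallis H statistic with the usual tie correction, written by hand on toy numbers (function names are hypothetical). It treats the groups as independent samples; when every vehicle contributes one value to every location, the Friedman test is the rank-based analogue that respects that matching. H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom, which is not computed here.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Toy example with three clearly separated groups and no ties.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_stat, 3))
```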

Thanks again

#### Miner

##### TS Contributor
ANOVA is extremely robust against non-normality. With 100 data points, you have nothing to worry about. The only question that I would ask about this plot is the one point in the upper right. Since these are standardized residuals, that point is about 5.5 standard deviations from the predicted value, which makes it an outlier. I doubt that it will change your conclusions, but it is worth investigating.
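A short sketch of how such a point could be flagged numerically. It assumes the simple definition of a standardized residual, residual divided by the square root of the within-group mean square, and a cutoff of 3 as a common rule of thumb; statistical packages may use a slightly different (studentized) definition. The data are invented, with one deliberately extreme value.

```python
from math import sqrt
from statistics import mean

# Hypothetical groups; the last value in "location C" is an extreme reading.
groups = {
    "location A": [9.0, 10.0, 11.0, 9.0, 10.0, 11.0, 10.0, 10.0],
    "location B": [19.0, 20.0, 21.0, 19.0, 20.0, 21.0, 20.0, 20.0],
    "location C": [30.0] * 7 + [46.0],
}


def standardized_residuals(groups):
    """Return (group, value, residual / sqrt(MSE)) for every observation."""
    n_total = sum(len(v) for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    mse = ss_within / (n_total - len(groups))
    out = []
    for name, values in groups.items():
        group_mean = mean(values)
        out.extend((name, x, (x - group_mean) / sqrt(mse)) for x in values)
    return out


# Flag observations more than 3 standard deviations from their fitted value.
flagged = [(name, x, z) for name, x, z in standardized_residuals(groups)
           if abs(z) > 3]

for name, x, z in flagged:
    print(f"{name}: value {x} has standardized residual {z:.2f}")
```

A flagged point is a prompt to check the lab record for that sample, not an automatic reason to drop it.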

#### patnvkelly

##### New Member
Hi again,
Below is the query that was put to me about my original analysis. With the help and suggestions you have given me, I have redone the ANOVA and my overall result has not changed. Do you think I have answered the query correctly?

Thanks

"An analysis of variance showed that the effect of different vehicle areas on DNA quantification results was not significant, F (4, 95) = 1.64, p = 0.171."
As the locations are 'nested' within the vehicles, it is unlikely that the assumption of independence holds. I would suggest that the test needs to simultaneously take vehicle number as well as location number into account.

#### Miner

##### TS Contributor
I hesitate to confirm this simply because I do not have full knowledge and understanding of all that you have done or not done in this study/analysis. I will say that there does not appear to be anything in this statement that contradicts what we discussed.