## What test to compare bacterial growth in three areas with repeating variables?

I have a dataset of the number of bacterial colonies grown from 11 water samples taken from 3 different watersheds. I want to know if the land area, population, number of wastewater treatment plant outfalls, and number of septic tanks in the watershed have any effect on the bacterial growth. The sampling is unbalanced, with 8 samples from watershed A, 1 from B, and 2 from C.

Essentially I am trying to compare 3 treatments by categorical numbers, since the land area is one of three numbers, right? I am confused and don't know which test to use, please help! I've attached a file of the data table if this post is confusing. I am grateful for any advice.

Summarizing your question: You have 4 variables and you want to know whether they are related to bacterial growth.

Looking at your data, all four variables have 3 distinct values, corresponding 100% with location. So in fact you only have 3 groups, regardless of which variable you choose to group on, and you basically just want to compare the growth between A, B and C. Which variable is responsible for any difference in growth between sites cannot be determined; at most you would know that there is a difference between the sites.
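To make the confounding concrete, here is a minimal sketch of how one could check it. The function name `confounded_with_site`, the column names, and all numbers are made up for illustration and are not the values from the attached table; only the 8/1/2 split of samples over A, B and C comes from the question.

```python
from collections import defaultdict

def confounded_with_site(rows, site_key, covariates):
    """For each covariate, report whether its values map one-to-one onto sites."""
    result = {}
    for cov in covariates:
        values_per_site = defaultdict(set)
        for row in rows:
            values_per_site[row[site_key]].add(row[cov])
        # Each site must carry exactly one value of the covariate...
        constant_within_site = all(len(v) == 1 for v in values_per_site.values())
        # ...and no two sites may share that value.
        distinct_values = {next(iter(v)) for v in values_per_site.values()}
        result[cov] = constant_within_site and len(distinct_values) == len(values_per_site)
    return result

# Placeholder values (hypothetical, NOT the poster's data); 8 + 1 + 2 = 11 samples.
rows = (
    [{"site": "A", "land_area": 120.0, "septic_tanks": 300}] * 8
    + [{"site": "B", "land_area": 45.0, "septic_tanks": 900}] * 1
    + [{"site": "C", "land_area": 80.0, "septic_tanks": 150}] * 2
)

check = confounded_with_site(rows, "site", ["land_area", "septic_tanks"])
print(check)  # {'land_area': True, 'septic_tanks': True}
```

When every covariate comes back `True`, grouping by any of them gives exactly the same three groups as grouping by site, so their effects cannot be separated.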

But the real pain is your sample size per location. For A you are fine with 8 measurements (which are HIGHLY VARIABLE!), but for B and C (n=1 and n=2) you simply cannot test any association between site (= all variables in this case) and growth.

I would make a box plot of the measurements at site A and plot the individual measurements at sites B and C as points next to it. Site C seems to be relatively low, but still within the range of A, while the single measurement at B is much higher. This visualization will show that without attaching any statistical value (significance) to it.
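If you want the same comparison as numbers next to the plot, a rough sketch could look like this. The helper `compare_to_reference` and the colony counts are hypothetical placeholders chosen only to mimic the pattern described above; substitute your own counts.

```python
from statistics import median

def compare_to_reference(counts, reference):
    """Summarize the reference site and place every other value relative to its range."""
    ref = sorted(counts[reference])
    summary = {"min": ref[0], "median": median(ref), "max": ref[-1]}
    positions = {}
    for site, values in counts.items():
        if site == reference:
            continue
        positions[site] = [
            "within" if ref[0] <= v <= ref[-1] else ("above" if v > ref[-1] else "below")
            for v in values
        ]
    return summary, positions

# Placeholder colony counts (hypothetical, NOT the poster's data).
counts = {
    "A": [12, 40, 7, 95, 23, 61, 15, 33],
    "B": [210],
    "C": [9, 14],
}

summary, positions = compare_to_reference(counts, "A")
print(summary)
print(positions)
```

This is purely descriptive: it says where the B and C measurements fall relative to the spread at A, and it is not a test of any kind.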

Best, Martin

