# Is my variable count or measurement and what statistical model would be best?

#### akilles88

##### New Member
I am trying to assess the fine-scale spatial distribution of tadpoles along a depth gradient in relation to abiotic and biotic factors -- and am having great difficulty in determining the best statistical test (Chi square VS ANOVA) to use BECAUSE I cannot determine what kind of data I have.

In short, my experimental design involves sampling along multiple transects in a pond.
Along these transects, I've established THREE depth zones from which samples are taken: (1) Edge, (2) Shallow, and (3) Deep. So, for every transect (n=5) there are three zones (k=3), giving a total sample size of 15. A sampling tube of known area is pushed through the water column into the substrate to isolate all macro organisms in the water column. Once a sampling column is established, temp (at the bottom of the column), D.O., depth, and pH are measured. Then, tadpoles and dragonfly nymphs are collected from the tube sampler and counted. THESE variables are killing me.

So - in theory, this design should allow me to assess the fine spatial distribution of tadpoles based on how many were found in the three depth zones - as well as relate the non-uniform distribution to variations in abiotic and biotic factors.

Are my data referring to the number of tadpoles and dragonfly larvae collected from the tube sampler considered a measurement variable (meristic) or categorical?

So far, I have used a one-way ANOVA to determine whether the mean number of tadpoles collected differs significantly among the three depth zones, followed by Tukey's HSD to determine which zones differ.

Is my use of the ANOVA invalid considering the nature of my variable, or should I just use the Chi square test where the null is that the number of tadpoles found in the three depth zones is uniform? OR can I add a constant to my data (I do have zeros for some values) and use a square root transformation to strengthen the results of the ANOVA? I could also convert the number of tadpoles found to density of tadpoles because I am sampling with a tube of known area (BUT VARYING VOLUME), but would this be frowned upon?

Thank you for all your help. I am really struggling with this and will appreciate any type of contribution.

#### akilles88

##### New Member
.... is the description of my problem adequate - or do I need to elaborate some more?

##### Ninja say what!?!
It's going to be very hard to get any real finding with only 15 obs. Is it possible to get more?

If you're dealing with the NUMBER of larvae, then your variable is numerical. ANOVA seems fine to me from what you described, but you should still get more data. Once you do, check whether the counts are normally distributed within each depth zone.

You should try using Poisson regression too, if you are familiar with it.
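
In case it helps, here is a minimal Python sketch of what a Poisson regression with depth zone as the only predictor boils down to. The counts are made up for illustration (they are not your data), and the names `counts`, `poisson_loglik` and `lr_stat` are placeholders. With a single categorical predictor, the fitted mean for each zone is just that zone's sample mean, so no fitting routine is needed. The likelihood-ratio statistic is compared to a chi-square distribution with 2 degrees of freedom, which is a large-sample approximation and only rough at n = 15.

```python
import math

# Hypothetical counts (NOT the real data): tadpoles per tube sample,
# one value per transect, grouped by depth zone.
counts = {
    "edge":    [12, 9, 15, 7, 11],
    "shallow": [5, 8, 3, 6, 4],
    "deep":    [0, 1, 0, 2, 0],
}

def poisson_loglik(ys, mu):
    """Poisson log-likelihood of the counts ys under a common mean mu > 0."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in ys)

# Zone model: with depth zone as the only (categorical) predictor,
# the maximum-likelihood mean for each zone is its sample mean.
zone_means = {zone: sum(ys) / len(ys) for zone, ys in counts.items()}
ll_zone = sum(poisson_loglik(ys, zone_means[zone]) for zone, ys in counts.items())

# Null model: one mean shared by all three zones.
all_counts = [y for ys in counts.values() for y in ys]
grand_mean = sum(all_counts) / len(all_counts)
ll_null = poisson_loglik(all_counts, grand_mean)

# Likelihood-ratio statistic, referred to chi-square with k - 1 = 2 df.
lr_stat = 2 * (ll_zone - ll_null)

# For exactly 2 df the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-lr_stat / 2)

print(f"LR statistic = {lr_stat:.2f}, approximate p = {p_value:.3g}")
```

A zone whose counts are all zero would give a fitted mean of 0 and make `math.log(mu)` fail, so this sketch assumes every zone has at least one tadpole.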

#### akilles88

##### New Member
The pond itself is not very large, and the nature of sampling is destructive. If I were to try to re-sample along those transects there would be a chance of previous disturbance affecting the natural distribution of tadpoles.

I've recently come across the Poisson regression, and it DOES seem like a very useful tool in generating explanatory models of tadpole distribution.
Please correct me if I'm wrong, but I believe it can take categorical values as the MODEL (i.e. depth zones) as well as continuous values as COVARIATES (i.e. temp, D.O., pH, depth <- although this may be redundant). Is there a way to include dragonfly nymphal counts in the regression as well? Ruling out predator avoidance was included in several of my a priori hypotheses. I originally planned to use multiple regression to generate the global model as stated above, then compare the global model to simplified models using AICc.
Since that would not be valid for count data, I believe I can use QAICc to compare models generated by Poisson regressions.
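
To make the comparison concrete, here is a rough Python sketch of AICc and QAICc as I understand them. The log-likelihoods, parameter counts and the overdispersion factor `c_hat` below are invented for illustration; in practice `c_hat` would come from the global model (Pearson chi-square divided by residual degrees of freedom), and the model names are placeholders.

```python
def aicc(loglik, k, n):
    """Small-sample corrected AIC for a model with k estimated parameters."""
    return -2 * loglik + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

def qaicc(loglik, k, n, c_hat):
    """Quasi-likelihood AICc; c_hat is the variance inflation factor."""
    c_hat = max(c_hat, 1.0)  # by convention, a c_hat below 1 is treated as 1
    k_q = k + 1              # count c_hat as one extra estimated parameter
    return -2 * loglik / c_hat + 2 * k_q + (2 * k_q * (k_q + 1)) / (n - k_q - 1)

n_obs = 15   # 5 transects x 3 depth zones
c_hat = 1.8  # assumed value, for illustration only

# Hypothetical candidate models: name -> (log-likelihood, number of parameters).
models = {
    "global":    (-28.0, 7),  # intercept, 2 zone dummies, temp, DO, pH, nymph count
    "zone_only": (-31.5, 3),
    "temp_only": (-33.0, 2),
}

scores = {name: qaicc(ll, k, n_obs, c_hat) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)
deltas = {name: score - scores[best] for name, score in scores.items()}

for name in sorted(scores, key=scores.get):
    print(f"{name:10s} QAICc = {scores[name]:6.2f}  delta = {deltas[name]:5.2f}")
```

With only 15 observations the small-sample correction term is large for the 7-parameter model, so in these made-up numbers the global model ranks last despite having the highest likelihood.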

#### antonitsin

##### New Member
What is your response variable?
Use Poisson regression if your response is counts (a zero-inflated Poisson model if your response has lots of zeros);
otherwise you just have a discrete numerical predictor, which can be used in any regression if all other assumptions are satisfied.
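
A quick way to see whether plain Poisson is plausible is to compare the variance with the mean and the observed number of zeros with what a Poisson distribution would predict. The sketch below does this per depth zone in Python; the counts are made up (not the poster's data), and with only 5 samples per zone this is a rough screen, not a formal test.

```python
import math
from statistics import mean, variance

# Hypothetical counts (NOT the real data), one value per transect.
counts = {
    "edge":    [12, 9, 15, 7, 11],
    "shallow": [5, 8, 3, 6, 4],
    "deep":    [0, 1, 0, 2, 0],
}

summary = {}
for zone, ys in counts.items():
    m = mean(ys)
    summary[zone] = {
        "mean": m,
        # Variance-to-mean ratio: about 1 under Poisson, well above 1
        # suggests overdispersion.
        "var_to_mean": variance(ys) / m,
        "observed_zeros": sum(1 for y in ys if y == 0),
        # Under Poisson with mean m, P(count = 0) = exp(-m).
        "expected_zeros": len(ys) * math.exp(-m),
    }

for zone, s in summary.items():
    print(f"{zone:8s} mean={s['mean']:.1f} var/mean={s['var_to_mean']:.2f} "
          f"zeros: observed={s['observed_zeros']}, expected={s['expected_zeros']:.2f}")
```

If the observed zeros are far above the expected number, a zero-inflated model is worth a look; if only the variance-to-mean ratio is high, a quasi-Poisson or negative binomial model may be enough.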

#### TheEcologist

##### Global Moderator
Hi akilles88,

Let me get this straight. You have 5 transects, each with three depth categories. If the transects are viewed as independent, you have 5 replicates per depth zone (this hints towards a Kruskal-Wallis, or a Poisson regression as suggested by antonitsin, although I doubt it is needed). If transects cannot be viewed as independent, you use the total number found per depth over all transects and conduct a Chi-square.

When should transects be treated as independent? For example, when the distances between them are large enough that the risk of sampling the same tadpoles again is zero and the risk of disturbing the next transect by sampling the current one is negligible. In that case you can treat each transect as an independent measurement.
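
For the independent case, a Kruskal-Wallis test can be sketched in a few lines of Python. The counts are invented for illustration (not the real data) and `kruskal_wallis` is just a helper written for this post. The p-value uses the chi-square approximation with 2 degrees of freedom, which is crude with 5 samples per zone; exact tables or a permutation test would be safer.

```python
import math

def kruskal_wallis(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign midranks: tied values share the average of the ranks they occupy.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie group
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total**3 - n_total))  # tie correction

# Hypothetical tadpole counts per transect (NOT the real data).
edge = [12, 9, 15, 7, 11]
shallow = [5, 8, 3, 6, 4]
deep = [0, 1, 0, 2, 0]

h_stat = kruskal_wallis([edge, shallow, deep])
# Chi-square upper tail for exactly 2 df (3 groups) is exp(-x / 2).
p_approx = math.exp(-h_stat / 2)
print(f"H = {h_stat:.3f}, approximate p = {p_approx:.4f}")
```
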
When are the transects not independent? For instance, when you sample the same transect 5 times, or sample multiple times in the same small pond. In that case you should sum your results.
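
For the pooled case, the Chi-square goodness-of-fit test on the summed counts looks roughly like this in Python. The totals are made up (not the real data), and equal expected counts assume equal sampling effort in every zone, which holds here only because each zone gets the same number of tube samples of the same area.

```python
import math

# Hypothetical totals per depth zone, summed over all transects (NOT the real data).
totals = {"edge": 54, "shallow": 26, "deep": 3}

n_tadpoles = sum(totals.values())
# Null hypothesis: tadpoles are spread uniformly over the three zones.
expected = n_tadpoles / len(totals)

chi_sq = sum((observed - expected) ** 2 / expected for observed in totals.values())

# With 3 zones there are 2 df, and the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)
print(f"chi-square = {chi_sq:.2f}, df = 2, p = {p_value:.3g}")
```

Keep in mind that this treats every tadpole as an independent observation, which is a strong assumption if tadpoles aggregate.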

Here are some useful webpages to answer the following questions:

What biological variables do I have?
http://udel.edu/~mcdonald/statvartypes.html

What test should I use?
http://udel.edu/~mcdonald/statbigchart.html

#### akilles88

##### New Member
TheEcologist,

Yes, I intended for the transects to be treated as independent. In my experimental design I followed very strict protocols during sampling to minimize any type of disturbance that would disrupt the natural distribution of organisms. The following are the major protocols that were established: (1) no transect may be sampled more than once per sampling day, (2) only three samples may be taken from the pond per sampling day, (3) each subsequent sample must maximize the distance from the previous sample of that sampling day, (4) no transect*depthzone combo may be sampled more than once during the entire sampling period. Also, sampling tubes were transported by boat so that time in the water was limited to extracting the contents of the sampling tube.

I believe the fine-scale distribution of tadpoles will be primarily influenced by a thermal gradient - especially in the case of Lithobates clamitans clamitans. Since sampling took place during late fall to early autumn in a temperate dimictic pond, thermal stratification should still be present. Dissolved oxygen and temperature have been shown in laboratory experiments to influence the growth and behavior of tadpoles, affecting development and predation rates. Thus, tadpoles should favor shallow areas that provide warmer temps as well as higher DO concentrations due to wind driven mixing. While all tadpoles should seek these areas, intraspecific competition for these resources should produce a gradation in which tadpole density decreases as depth increases.

Thank you so much for your comments. I am always appreciative of any knowledge I can get from other ecologists (at least I hope to be one day).

#### TheEcologist

##### Global Moderator
Hi akilles88,

Glad I could help, I hope you find an appropriate analysis. Try the suggestions from the others and mine. Post back if you are uncertain.