Interaction term with no biological significance

lken

New Member
#1
Hello friends,

I've used a negative binomial GLM to look at how invertebrate abundance (which is overdispersed) is affected by seagrass density.

There is a negative interaction between seagrass density and epiphyte biomass (the little plants growing on seagrass blades that invertebrates eat).

My model implies that when epiphyte biomass is low, seagrass density increases the number of invertebrates. When epiphyte biomass is high, increasing the number of shoots decreases the number of invertebrates exponentially. This interaction does not make biological sense to me any way I think about it.
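To make the sign flip concrete: assuming the model uses a log link (the usual default for a negative binomial GLM), the effect of seagrass density on the log scale is the density coefficient plus the interaction coefficient times epiphyte biomass. A minimal sketch follows. The coefficient values are hypothetical and chosen only for illustration; they are not the fitted values from this model.

```python
import math

# HYPOTHETICAL coefficients on the log scale (not the fitted values).
b_density = 0.020       # main effect of seagrass density
b_interaction = -0.030  # density x epiphyte biomass interaction

def density_slope(epiphyte_g):
    """Change in log(expected abundance) per extra shoot at a given epiphyte biomass."""
    return b_density + b_interaction * epiphyte_g

def multiplier_per_shoot(epiphyte_g):
    """Multiplicative change in expected abundance per extra shoot."""
    return math.exp(density_slope(epiphyte_g))

# Epiphyte biomass at which the density effect changes sign.
crossover_g = -b_density / b_interaction

for e in (0.0, 0.5, 1.0, 4.0):
    print(f"epiphyte={e:.1f} g  slope={density_slope(e):+.3f}  "
          f"x{multiplier_per_shoot(e):.3f} per shoot")
print(f"density effect changes sign at about {crossover_g:.2f} g")
```

Because the slope sits inside an exponential, a negative slope at high epiphyte biomass shows up as an exponential decline on the count scale, which is the pattern described above.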

When I remove the top and bottom 15% of the dataset, the interaction term is no longer significant. This leads me to believe that a few outliers in the dataset are creating this interaction, but I'm not sure where to go from here. Can I justify leaving the interaction out without being biased?

With interaction included:
Null deviance: 136.532 on 74 degrees of freedom
Residual deviance: 90.627 on 70 degrees of freedom
AIC: 1043.1

With no interaction:
Null deviance: 122.169 on 74 degrees of freedom
Residual deviance: 90.996 on 71 degrees of freedom
AIC: 1051.2
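For reference, the two AIC values above can be turned into a difference and a relative likelihood. This is only arithmetic on the reported numbers, and it assumes both models were fitted to the same 75 observations. The variable names are illustrative.

```python
import math

aic_with_interaction = 1043.1
aic_without_interaction = 1051.2

# Positive delta means the interaction model has the lower (better) AIC.
delta_aic = aic_without_interaction - aic_with_interaction

# Relative likelihood of the no-interaction model versus the interaction model.
relative_likelihood = math.exp(-delta_aic / 2)

print(f"delta AIC = {delta_aic:.1f}")
print(f"relative likelihood of the no-interaction model = {relative_likelihood:.4f}")
```

A gap of about 8 AIC units favours the interaction model on fit alone, but it says nothing about whether a handful of influential points are responsible for that fit.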
 

Miner

TS Contributor
#2
I know absolutely nothing about your field, so I am speaking strictly from a logic perspective.

First, your title implies that this interaction does not make theoretical sense. If theory is well established in your discipline, you may want to put a lot of weight on that and recognize that this may be a Type I error. If theory is still developing, don't be too hasty to discard the interaction, because it might be a real breakthrough. In my field of industrial statistics, I occasionally see significant interactions that are physically impossible. Type I error strikes.

Second, never remove 30% of your data. Removing the extremes is removing most of the signal (effect) and none of the noise. See attached example. Removing that much data weakens the relationship.
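The point about trimming can be checked with a small simulation. This is a sketch on synthetic data, not the seagrass dataset. It assumes a simple linear signal with constant noise and that trimming is done on the predictor. The seed, sample size and noise level are arbitrary choices.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000
x = [rng.uniform(0, 10) for _ in range(n)]
y = [xi + rng.gauss(0, 3) for xi in x]  # signal plus constant noise

# Drop the lowest and highest 15% of observations, ranked by x.
pairs = sorted(zip(x, y))
k = n * 15 // 100
trimmed = pairs[k:n - k]

r_full = pearson_r(x, y)
r_trimmed = pearson_r([p[0] for p in trimmed], [p[1] for p in trimmed])

print(f"r with all data:     {r_full:.3f}")
print(f"r after 30% trimmed: {r_trimmed:.3f}")
```

The noise around the line is the same in both cases. Trimming only narrows the range of x, so the correlation drops even though the underlying relationship has not changed.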

Also, outliers (unless they are caused by measurement errors) are often a sign of lurking variables that are not part of your experimental design.
 

bugman

Super Moderator
#3
Just putting some thought around the ecological aspects of this.

Can you show a plot of seagrass density vs invertebrate abundance at high and low epiphyte biomass?

Are your abundance data standardised by sampling effort or area?

This might indicate that habitat rather than food resources is limiting.
Epiphytes typically smother habitat area (no experience with seagrass here), but they also provide a food resource. If the two are mutually exclusive, this would imply that as your epiphyte biomass increases, your habitat availability decreases. I also suspect there would be some relationship between epiphyte cover and the photosynthetic rate of the seagrass, which may add another horrible level of complexity to this.

This may be way off since I am coming from freshwater environments, but just chucking in my 2 cents.
 

lken

New Member
#4
Those comments do make a lot of sense, but adding seagrass shoots should still increase surface area when epiphyte biomass is high (though perhaps with a weaker positive effect than when epiphyte biomass is low). I wouldn't have expected a negative result...

After trying to make the plots you mentioned, I realized only 3 out of 70 seagrass quadrats had more than 1g of dry epiphyte biomass. I was generating model predictions over the full range of epiphyte biomass (0-4g), but realistically it's below 1g about 95% of the time.

When I predict from the model across 0-1, the effect of seagrass shoot density is dampened where epiphyte biomass is high. This makes more sense to me and could be a result of habitat space, as you discussed. Can I get away with showing a graph of invertebrate abundance across seagrass density, with predicted lines for epiphyte biomass ranging from 0-1 (the norm) instead of 1-4? I think my model does not predict well when epiphyte biomass is high because there are only a few outlying points. I don't necessarily want to exclude them though, because they likely do fall in a reasonable range.
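One way to frame the plotting choice: keep every quadrat in the fit, but pick the epiphyte levels for the prediction lines from the range where most of the data sit. The sketch below uses hypothetical biomass values that only mimic the "3 of 70 above 1g" pattern described above; they are not the real measurements. The 95th-percentile cutoff is likewise an assumption.

```python
import math

def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

# HYPOTHETICAL epiphyte biomass (g): 67 quadrats at or below 1 g, 3 above.
epiphyte = [i / 66 for i in range(67)] + [1.8, 2.9, 4.0]

share_above_1g = sum(e > 1 for e in epiphyte) / len(epiphyte)
upper = percentile(epiphyte, 95)  # top of the well-supported range

# Low, middle and high epiphyte levels for the prediction lines.
levels = [round(upper * f, 2) for f in (0.0, 0.5, 1.0)]

print(f"share of quadrats above 1 g: {share_above_1g:.1%}")
print(f"95th percentile: {upper:.2f} g, prediction levels: {levels}")
```

Reporting the share of quadrats beyond the plotted range alongside the figure makes it clear that the lines are restricted to where the model is supported by data, without dropping any points from the fit.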