Hello people,

I would like to ask for your help with my analysis. Briefly: I have daily insect captures from a set of traps located in different places. I'd like to relate some biological variables (annual abundance, date of first capture) to some independent variables, for example winter temperature and cultivated area, and also to the interactions between those independent variables.

My data look like this:

    Trap  Year  Dep.variable  Indep.variable1  Indep.variable2  ...
1   A     1991  881           5000             6
2   A     1992  1341          5000             7
3   A     1993  369           3000             8
...
4   B     1993  367           1000             4
5   B     1994  210           200              2
6   B     1995  109           200              3
...

Note that my dataset is unbalanced (I don't have the same number of observations in each group), and I would like to use all of the data.

My first thought was to fit an LM/GLM (or a GAM, because some of the relationships are nonlinear), like this:

lm(Dep.variable ~ Indep.variable1 * Indep.variable2 + ... + Trap + Year)
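To make the question concrete, here is a minimal sketch of that kind of model in base R, using only the six example rows shown above. The data frame `d` and the object names are just for illustration, and I have dropped the interaction term because six rows cannot support it. It also shows one thing I am unsure about: whether Year should enter as a number (a single linear trend) or as a factor (one coefficient per year).

```r
# Toy data: the six example rows from the table above.
d <- data.frame(
  Trap            = factor(c("A", "A", "A", "B", "B", "B")),
  Year            = c(1991, 1992, 1993, 1993, 1994, 1995),
  Dep.variable    = c(881, 1341, 369, 367, 210, 109),
  Indep.variable1 = c(5000, 5000, 3000, 1000, 200, 200),
  Indep.variable2 = c(6, 7, 8, 4, 2, 3)
)

# Trap as a factor gives each trap its own intercept;
# Year as a number gives one common linear trend over time.
fit_trend <- lm(Dep.variable ~ Indep.variable1 + Trap + Year, data = d)

# Number of columns in the design matrix under each coding of Year.
# With Year as a factor, every distinct year gets its own coefficient,
# which quickly uses up degrees of freedom in a small dataset.
n_cols_numeric <- ncol(model.matrix(~ Trap + Year, data = d))
n_cols_factor  <- ncol(model.matrix(~ Trap + factor(Year), data = d))

print(coef(fit_trend))
print(c(numeric_year = n_cols_numeric, factor_year = n_cols_factor))
```

With so few rows this is obviously not a real analysis; it is only meant to show the structure of the model I am asking about.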

I've found some literature on similar analyses, and that's what the authors did most of the time. I've tried it (LM) and my models fit reasonably well most of the time, but I'm not sure about my procedure. Is it appropriate to include Trap and Year as fixed effects here? I understand this may be a naive question, but I can't make up my mind about it.

But I have also read a bit about panel data, particularly pooled cross-sectional procedures, and it seems this type of analysis would be well suited too, since I'm studying time series.

To be clear, my aim is explanatory rather than predictive (my purpose is to interpret the effects of each independent variable and their interactions, not really to forecast future events). Ideally, I would like to know the "general" effect of each independent variable, independently of location, but also its local effects.

However, I lack statistical knowledge and can't figure out which type of model would be best suited to such an analysis. Or should I try both models and then compare them?

Could you enlighten me a little? I'm sorry if my questions aren't very clear; I'm a bit lost with this. Of course, I know I will eventually have to master these analyses, but I would welcome a bit of help in knowing which direction to go. Thanks in advance.
