Dear all,
I am currently writing my master's thesis on the effect of a certain insecticide on bumble bee colonies. In particular, I am testing whether the insecticide affects virus or bacterial concentrations. In many colonies I haven’t found any viruses, but where they are present they can occur in high concentrations.
I would like to use some kind of mixed model that reflects the nested design of the study from which I got the bumble bees.
The study design is hierarchical. Sixteen fields were paired according to landscape characteristics, giving 8 pairs. In each pair, one field was randomly assigned to be treated with the insecticide, while the other served as the control field. In each field there are 2 boxes, and in each box there are 2 bumble bee hives, so there are 64 hives in total.
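To make the nesting concrete, here is a minimal sketch of the design in R. The pair and field labels are placeholders, not my real labels, and the column names `field.id`, `box.id` and `hive.id` are made up for this illustration. The point is only that box and hive labels such as "A" or "1" repeat across fields, so unique IDs are needed before they can be used as grouping factors.

```r
# Hypothetical skeleton of the design: 8 pairs x 2 fields x 2 boxes x 2 hives
design <- expand.grid(
  hive  = c("1", "3"),            # hive labels within a box, as in hive.label
  box   = c("A", "B"),            # box labels within a field
  field = c("F1", "F2"),          # placeholder labels for the two fields of a pair
  pair  = sprintf("P%02d", 1:8),  # placeholder pair labels
  stringsAsFactors = TRUE
)

# Labels repeat across fields, so build IDs that are unique over the whole study
design$field.id <- interaction(design$pair, design$field, drop = TRUE)
design$box.id   <- interaction(design$field.id, design$box, drop = TRUE)
design$hive.id  <- interaction(design$box.id, design$hive, drop = TRUE)

# Number of distinct units at each level of the hierarchy
sapply(design[c("pair", "field.id", "box.id", "hive.id")], nlevels)
```

This should report 8 pairs, 16 fields, 32 boxes and 64 hives, which agrees with the maxima of box.nested (32) and hive.nested (64) in my data.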
This is what my data look like:
> summary(adults.col)
labno pair field box hive.label hive.real mass.mean
1 : 1 P01 : 8 VR02 : 4 A:32 1:32 1:30 Min. :0.03763
10 : 1 P02 : 8 VR03 : 4 B:32 3:32 2: 2 1st Qu.:0.12208
11 : 1 P03 : 8 VR04 : 4 3:32 Median :0.15135
12 : 1 P04 : 8 VR05 : 4 Mean :0.15637
13 : 1 P05 : 8 VR06 : 4 3rd Qu.:0.18863
14 : 1 P10 : 8 VR07 : 4 Max. :0.27110
(Other):58 (Other):16 (Other):40
mass.sem itd.mean itd.sem SBPV.SQ.mean SBPV.SQ.se
Min. :0.00418 Min. :4.650 Min. :0.06751 Min. : 1.475 Min. : 0.2050
1st Qu.:0.01305 1st Qu.:5.066 1st Qu.:0.14619 1st Qu.: 5.317 1st Qu.: 0.6325
Median :0.01578 Median :5.319 Median :0.17602 Median : 6.553 Median : 2.3139
Mean :0.01686 Mean :5.337 Mean :0.18367 Mean : 28.125 Mean : 5.0554
3rd Qu.:0.01914 3rd Qu.:5.596 3rd Qu.:0.20829 3rd Qu.: 22.837 3rd Qu.: 9.0162
Max. :0.03553 Max. :6.306 Max. :0.34589 Max. :121.000 Max. :14.1775
NA's :58 NA's :58
SBPV.detected.SBPV.detected ABPV.SQ.mean ABPV.SQ.se
Min. :0.00000 Min. : 11.8 Min. : 4.91
1st Qu.:0.00000 1st Qu.: 19.5 1st Qu.: 10.81
Median :0.00000 Median : 33.1 Median : 19.55
Mean :0.09375 Mean : 78668.0 Mean : 44832.98
3rd Qu.:0.00000 3rd Qu.: 242.4 3rd Qu.: 137.59
Max. :1.00000 Max. :393033.3 Max. :223992.04
NA's :59 NA's :59
ABPV.detected.ABPV.detected SBV.SQ.mean SBV.SQ.se SBV.detected.SBV.detected
Min. :0.000000 Min. : 3.885 Min. : 0.095 Min. :0.00000
1st Qu.:0.000000 1st Qu.: 6.848 1st Qu.: 1.917 1st Qu.:0.00000
Median :0.000000 Median : 8.710 Median : 5.082 Median :0.00000
Mean :0.078125 Mean : 34.028 Mean : 23.240 Mean :0.15625
3rd Qu.:0.000000 3rd Qu.: 31.654 3rd Qu.: 22.671 3rd Qu.:0.00000
Max. :1.000000 Max. :150.550 Max. :119.450 Max. :1.00000
NA's :54 NA's :54
box.nested hive.nested
Min. : 1.00 Min. : 1.00
1st Qu.: 8.75 1st Qu.:16.75
Median :16.50 Median :32.50
Mean :16.50 Mean :32.50
3rd Qu.:24.25 3rd Qu.:48.25
Max. :32.00 Max. :64.00
I have been looking for a model that can deal with both the many zeros and the high concentrations, but I have not really found anything, although I expect this is a common problem. I came across zero-inflated models, but I was advised not to use them because they are meant for count data, and the relatively high SQ values (the response variable) of the viruses (ABPV, SBPV, SBV) would not suit them. I was also told not to use rank tests because of the high number of ties (zeros).
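To illustrate the structure of the problem: my data already come in two parts, a 0/1 detection column and an SQ value that is NA wherever nothing was detected. Below is a rough sketch of how those two parts could be modelled separately (detection with a binomial mixed model, log concentration among detected hives with a linear mixed model). I do not know whether this is appropriate, which is part of my question. The data frame `d` is simulated, all values are made up, and it contains more detections than my real data (6, 5 and 10 of 64 hives for SBPV, ABPV and SBV). The model calls assume the lme4 package is installed and are skipped otherwise.

```r
# Simulated stand-in for adults.col; every value below is made up
set.seed(42)
d <- expand.grid(hive = c("1", "3"), box = c("A", "B"),
                 field = c("F1", "F2"), pair = sprintf("P%02d", 1:8))
d$treatment <- factor(ifelse(d$field == "F1", "C", "N"))  # placeholder assignment
d$detected  <- as.integer(seq_len(nrow(d)) %% 3 == 1)     # arbitrary detection pattern
d$SQ        <- ifelse(d$detected == 1,
                      exp(rnorm(nrow(d), mean = 3, sd = 1.5)), NA)

# Part 1 uses every hive with the binary response (virus detected or not).
# Part 2 uses only hives with a detection, on the log concentration scale.
pos <- subset(d, detected == 1)
pos$logSQ <- log(pos$SQ)

if (requireNamespace("lme4", quietly = TRUE)) {
  # Part 1: does treatment change the probability of detection?
  m.det <- lme4::glmer(detected ~ treatment + (1 | pair/field),
                       family = binomial, data = d)
  # Part 2: given detection, does treatment change the concentration?
  # Only a pair-level random intercept, because positives are few.
  m.pos <- lme4::lmer(logSQ ~ treatment + (1 | pair), data = pos)
  print(lme4::fixef(m.det))
  print(lme4::fixef(m.pos))
}
```

With as few detections as I have, I would expect the second part to be barely estimable and the random-effect variances to be estimated at or near zero, so this is meant only to show the shape of the data, not a recommended analysis.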
Does anyone have a suggestion for what I could use instead?
Thanks in advance.
N.B. I still have to determine the bacterial concentrations or SQ values. Afterwards I will be told which fields were treated with the insecticide. To figure out how to do this kind of analysis in R, I created a vector pseudotreatment, which will later be replaced by a vector containing the real treatment information. Pseudotreatment has 2 levels: C for the control and N for the insecticide.
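For reference, a placeholder like this can be generated roughly as follows. This is a sketch with made-up pair and field labels, not my actual code: within each pair, one field is randomly labelled C and the other N, so the placeholder has the same structure as the real assignment will have.

```r
set.seed(1)  # arbitrary seed so the placeholder assignment is reproducible

# Hypothetical field table: 8 pairs with 2 fields each (labels are placeholders)
fields <- data.frame(
  pair  = rep(sprintf("P%02d", 1:8), each = 2),
  field = sprintf("F%02d", 1:16)
)

# For each pair draw a random order of C and N; replicate() returns a 2 x 8
# matrix, and as.vector() unrolls it column by column, i.e. pair by pair
fields$pseudotreatment <- factor(as.vector(replicate(8, sample(c("C", "N")))))

# Each pair should contain exactly one C field and one N field
table(fields$pair, fields$pseudotreatment)
```

The field-level labels can then be copied to the hive-level data by matching on the field column, for example with `match()`.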