Robust statistics explain findings of neonicotinoids field experiment

This week [June 29] we published a paper (Woodcock et al., 2017) in the journal Science, which provides recent evidence for country-specific effects of neonicotinoid pesticides on honeybees and wild bees. These results were generated from a large-scale, multi-country field experiment undertaken across a number of commercial farms.

The topic of neonicotinoid use is controversial, and the paper has therefore been a focus of much interest and comment from the agrochemical industry, academic community and media alike. Here, for those interested in the statistical validity of this peer-reviewed research, I outline why the CEH study was statistically robust.

Summary

  • It is important when designing experimental studies that statistical power is as high as possible.
  • Including Country in the statistical model accounts for a number of variables
  • Initial colony strength and Landscape are not confounding variables
  • Our wild bee findings are valid and account for Country differences and natural reproductive variability in the population

Statistical Power is important

To draw conclusions from an experiment’s statistical tests, it is necessary to understand the ‘statistical power’ of the test in question. Statistical power is the probability of detecting an effect, if it exists. If this power is low, then you may not be able to detect an effect despite there actually being one present. In fact, if power is lower than 50%, you are more likely to miss a true effect than to detect it, and may report that no significant differences were detected.
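
To make this concrete, here is a minimal Python sketch of how power can be approximated for a simple comparison of two group means. It is an illustration only, not the power analysis used in the study: it assumes a two-sided z-test with a known, equal standard deviation in both groups, and the effect size, standard deviation and sample sizes are hypothetical.

```python
from statistics import NormalDist


def two_sample_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Assumes both groups share a known standard deviation `sd` and that
    `effect` is the true difference between the group means.
    """
    std_normal = NormalDist()
    # Standard error of the difference between the two group means
    se = sd * (2.0 / n_per_group) ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect / se
    # Probability that the test statistic falls in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical numbers: a true difference of 1 unit, standard deviation of 2
for n in (5, 10, 20, 40):
    print(n, round(two_sample_power(effect=1.0, sd=2.0, n_per_group=n), 2))
```

With these made-up inputs the smallest design has power well below 50%, so a real effect would be missed more often than it is found. When the true effect is zero the function returns alpha, the false positive rate.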

It is important when designing experimental studies that statistical power is as high as possible.

Typically, sample size is the biggest factor in determining statistical power, with larger sample sizes providing greater statistical power. In this study, a comprehensive power analysis was undertaken before any data were collected, to determine the sample size and study design needed to detect any potential effects with sufficient power. The methodology we used was peer-reviewed and published in the Journal of Applied Ecology (Woodcock 2016). Pooling the data obtained across countries provided sample sizes sufficient for adequate power, so that a robust inference could be drawn from the analysis, while Country was accounted for in the statistical analysis. Conducting separate analyses for individual countries reduces the sample size used in each statistical test, thus reducing power and the ability to detect significant effects that may be present.

Because of this, those looking to base decisions on third-party reanalyses of our data should be cautious where separate analyses have been undertaken or data have been disaggregated among countries: the statistical power is reduced, increasing the likelihood of failing to detect a true effect.
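
The loss of power from splitting the data can be illustrated with a small simulation. The sketch below is not the model used in the paper: it assumes three hypothetical countries with different baseline values, the same true treatment effect in each, a known standard deviation (so a z-test can be used), and it adjusts for Country simply by subtracting each country's mean. All numbers are invented for illustration.

```python
import random
from statistics import NormalDist, mean

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% critical value


def z_test_rejects(control, treated, sd):
    """Two-sided z-test for a difference in means with known sd."""
    se = sd * (1.0 / len(control) + 1.0 / len(treated)) ** 0.5
    return abs(mean(treated) - mean(control)) / se > Z_CRIT


def simulate(n_sims=2000, sites_per_arm=6, effect=-1.5, sd=1.5, seed=1):
    """Estimate power for a pooled, country-adjusted test and for one country alone."""
    rng = random.Random(seed)
    baselines = {"A": 10.0, "B": 14.0, "C": 18.0}  # hypothetical country means
    pooled_hits = 0
    single_hits = 0
    for _ in range(n_sims):
        pooled_control, pooled_treated = [], []
        for country, base in baselines.items():
            control = [rng.gauss(base, sd) for _ in range(sites_per_arm)]
            treated = [rng.gauss(base + effect, sd) for _ in range(sites_per_arm)]
            # Account for Country by removing each country's own mean
            country_mean = mean(control + treated)
            pooled_control += [x - country_mean for x in control]
            pooled_treated += [x - country_mean for x in treated]
            # Separate analysis restricted to a single country
            if country == "A":
                single_hits += int(z_test_rejects(control, treated, sd))
        pooled_hits += int(z_test_rejects(pooled_control, pooled_treated, sd))
    return pooled_hits / n_sims, single_hits / n_sims


pooled_power, single_country_power = simulate()
print("pooled, adjusted for Country:", pooled_power)
print("one country analysed alone:  ", single_country_power)
```

Under these assumptions the same true effect is detected far more often by the pooled, country-adjusted test than by a test on one country's sites alone, because the single-country test has only a third of the data.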

The importance of Country in the statistical model

It is important to capture any between-country variation within the model and to ensure that confounding between the different estimated components of the model does not affect the results, including any potential effects of neonicotinoid exposure. Importantly, our analysis included Country in the model to account for any such differences and to ensure that confounding variables do not affect the results.

For example, we show below (Figure 1) that honeybee colony strength (i.e. worker bee number) did not differ between the treatments pre-exposure (significance tests confirm this, p=0.95). The boxplot illustrates that it would have been inappropriate to include colony strength in the model; doing so would have reduced the statistical power and therefore the ability to detect an effect.

Figure 1: Boxplot of number of honeybee workers pre-exposure by treatment*country
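
For readers who want to run a similar balance check on their own data, one simple option is a permutation test. The sketch below is an assumption for illustration and is not the significance test reported above: it compares group means using the between-group sum of squares, shuffles the treatment labels to build a null distribution, and ignores Country for brevity. The worker counts and treatment labels are hypothetical.

```python
import random
from statistics import mean


def between_group_ss(values, labels):
    """Between-group sum of squares: larger means the group means differ more."""
    grand = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())


def permutation_p_value(values, labels, n_perm=5000, seed=0):
    """P-value for 'no difference between groups' by shuffling the labels."""
    rng = random.Random(seed)
    observed = between_group_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_ss(values, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical pre-exposure worker counts for three treatment groups
workers = [5200, 4800, 5100, 4900, 5000, 5300, 5050, 4950, 5150]
treatment = ["control"] * 3 + ["compound_1"] * 3 + ["compound_2"] * 3
print(permutation_p_value(workers, treatment))
```

A large p-value from a check like this is consistent with the groups starting from a similar baseline. A more careful version would shuffle labels within each country.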

Likewise, our tests showed that landscape did not confound the statistical model and that Country adequately captured this site-specific variation. To illustrate the point, landscape structure (defined here as Shannon diversity) did not differ across the treatment groups (control and the two neonicotinoid compounds; see Figure 2) but did differ by country (see Figure 3), with the UK demonstrating higher values.
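
For reference, the Shannon diversity index is H = -sum(p_i * ln(p_i)), where p_i is the proportion of the landscape in class i. The short sketch below shows the calculation; the assumption that the inputs are areas of land-cover classes around a site, and the example values, are ours for illustration and are not taken from the paper.

```python
import math


def shannon_diversity(areas):
    """Shannon diversity index from the areas (or counts) of each class."""
    total = sum(areas)
    # Classes with zero area contribute nothing and are skipped
    proportions = [a / total for a in areas if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical land-cover areas (hectares) around two sites
print(shannon_diversity([90, 5, 5]))     # dominated by one class: low diversity
print(shannon_diversity([25, 25, 25, 25]))  # evenly mixed: higher diversity
```

The index is zero when a single class covers the whole landscape and reaches its maximum, ln(number of classes), when all classes cover equal areas.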

Figure 2: Boxplot of Shannon diversity index*Treatment


Figure 3: Boxplot of Shannon diversity index*Country


Statistics and wild bee findings

Some aspects of the study have been misunderstood in relation to our analysis of the effects of neonicotinoid nest residue level on reproductive success of bumble bees: these are worth exploring further. We do however agree that reproductive success of wild bee populations is variable. This field study was designed to allow for such real-world variability. Environmental variability such as this would have only decreased our ability to detect effects of neonicotinoid treatments. Nevertheless, we found a statistically significant association between nest residue levels and queen production.

It has been mistakenly suggested that CEH did not take account of 'Country' in the regression analysis of reproductive success in wild bees in response to neonicotinoid residues. This is accounted for, and is clearly stated, in the legend of the paper's Figure 3.

CEH maintains that the correlation between neonicotinoid nest residue level and reproductive success of bumble bees presented in the peer-reviewed paper is valid.

We do agree with comments made by others that the reproductive success of wild bee populations is variable, which is why the field study was designed to allow for such real-world variability (as we have done for initial colony strength and landscape differences, described above). Environmental or species variability such as this would only have decreased our ability to detect effects of neonicotinoid treatments. Nevertheless, we found a statistically significant negative association between neonicotinoid residue levels in wild bee nests and reproductive success: increasing residues were associated with decreasing queen production in bumble bees and decreasing egg cell count in the solitary bee Osmia bicornis.

... and in conclusion

The analyses conducted by CEH ensured that each model used to derive inference was appropriate and met all statistical assumptions made. The peer-review process for Science included scrutiny of our statistical approaches, ensuring these met the highest possible standard. Hence, all results derived and published are statistically robust.

The document is available from http://science.sciencemag.org/cgi/doi/10.1126/science.aaa1190
