A Bayesian framework for model calibration, comparison and analysis: Application to four models for the biogeochemistry of a Norway spruce forest
Highlights
► Four models of forest biogeochemistry were analysed in a Bayesian framework.
► The framework comprises model calibration, model comparison, analysis of model–data mismatch.
► We used data on soil water content and emissions of CO2, N2O and NO in a Norway spruce forest.
► Uncertainties about measurement error, model structure and parameterisation were quantified.
► We found high probabilities of systematic error in CO2 measurements and in model structure.
Introduction
Various recent reviews have assessed the evidence for impacts of environmental change on European forests (Hyvönen et al., 2007, Kahle et al., 2008, Luyssaert et al., 2010). Most studies have focused on changes in growth and carbon balance, but the importance of the interaction with the nitrogen cycle is increasingly recognised (de Vries et al., 2009, Sutton et al., 2008, Van Oijen et al., 2004, Van Oijen et al., 2008a). Research programmes to measure and model emissions of nitrogenous greenhouse gases from European forests and other ecosystems have been set up (Sutton et al., 2007).
The measurement of nitrous oxide (N2O) and nitric oxide (NO) emissions from forest soils is hampered by the large spatial and temporal heterogeneity in the fluxes, and modelling these processes is still limited by availability of data (Kesik et al., 2005). Moreover, the relevant underlying mechanisms have not yet been clarified fully, and large uncertainties are present in both data and models. Available data sets not only suffer from random measurement error, but also from systematic errors associated with the positioning of measurement chambers in the field and their functioning (Butterbach-Bahl et al., 2002, Kroon et al., 2010). When modelling the systems, there is uncertainty about how to represent processes, i.e. model structural uncertainty (de Bruijn et al., 2009). Furthermore, there is uncertainty about environmental drivers and parameter values.
To improve the applicability of models to the analysis of the greenhouse gas balance of forests, these uncertainties need to be quantified and reduced. Probabilistic methods of model–data fusion or data-assimilation have come to the fore in recent years, and offer the prospect of improved data use and uncertainty quantification (Fox et al., 2009, Wang et al., 2009). Because these methods are applications of probability theory, they require all uncertainties – in data, model inputs and model structure – to be expressed in the form of probability distributions. Bayes’ Theorem can then be employed to update the distributions when new information becomes available.
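As a reminder of the updating rule referred to here, Bayes' Theorem for a model's parameters can be written as below. The symbols θ (parameter vector) and D (data) are notation assumed for this illustration and are not taken from the original text.

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) = \int p(D \mid \theta)\, p(\theta)\, \mathrm{d}\theta
```

Here p(θ) is the prior distribution, p(D | θ) the likelihood and p(θ | D) the posterior. The normalising constant p(D) is the quantity that carries the evidence for a model when several models are compared.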
In biogeochemical modelling, most Bayesian applications have focused on parameterisation of individual models, with little attention to systematic errors in data and model structure. Wang et al. (2009) thus concluded, in a recent review of model–data fusion studies for terrestrial ecosystems, that there is a need for “developing an integrated Bayesian framework to study both model and measurement errors systematically”. The work presented here is intended to contribute to that goal.
We propose a framework which requires that multiple models are used in any given study, and which consists of three operations: (1) Bayesian calibration, (2) Bayesian model comparison and (3) analysis of model–data mismatch.
The overarching objective of this paper is to demonstrate that this three-stage framework is an effective tool for the analysis of models in forest biogeochemistry. For that purpose, we used four different published models and one rich data set from the Norway spruce forest in Höglwald, Germany (Kreutzer et al., 2009). Most of the data were on the nitrogen cycle, with long time series of measurements of emissions of N2O and NO, but we also used time series of the carbon and water cycles in the form of soil respiration and soil water content.
Bayesian calibration, i.e. the first operation in the framework, consists of defining a prior probability distribution for a model's parameters and updating that distribution using the data. The method has not often been applied to parameter-rich nonlinear process-based ecosystem models (Luo et al., 2009). One reason is the high computational demand associated with the technique, which is exacerbated by the long running time of the models. A second issue is the difficulty of quantifying uncertainties about random and systematic measurement errors. We show in this paper how both types of error can be accommodated in a Markov Chain Monte Carlo algorithm for Bayesian calibration.
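The idea of sampling a systematic measurement error alongside the model parameters can be sketched as follows. This is a minimal illustration and not the algorithm used for the four models: `toy_model`, the uniform prior, the Gaussian random error and the single additive bias term are all assumptions made here, since the excerpt does not specify the error model.

```python
import math
import random


def toy_model(params, times):
    """Hypothetical stand-in for a process model: a * exp(-b * t)."""
    a, b = params
    return [a * math.exp(-b * t) for t in times]


def log_prior(theta, bounds):
    """Uniform prior over a box: 0 inside the bounds, -inf outside."""
    inside = all(lo <= v <= hi for v, (lo, hi) in zip(theta, bounds))
    return 0.0 if inside else -math.inf


def log_likelihood(theta, times, obs, sigma):
    """Gaussian random error (s.d. sigma) around model output plus a bias.

    The last element of theta is an additive systematic error shared by
    all observations (an assumed error model, chosen for illustration).
    """
    *params, bias = theta
    sim = toy_model(params, times)
    const = -math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(const - 0.5 * ((y - s - bias) / sigma) ** 2
               for y, s in zip(obs, sim))


def metropolis(times, obs, sigma, bounds, start, step_sizes, n_iter, seed=0):
    """Random-walk Metropolis sampling of (model parameters, bias)."""
    rng = random.Random(seed)
    current = list(start)  # start must lie inside the prior bounds
    current_lp = (log_prior(current, bounds)
                  + log_likelihood(current, times, obs, sigma))
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Symmetric Gaussian proposal, so no Hastings correction is needed.
        proposal = [c + rng.gauss(0.0, s)
                    for c, s in zip(current, step_sizes)]
        lp = log_prior(proposal, bounds)
        if lp > -math.inf:
            lp += log_likelihood(proposal, times, obs, sigma)
        # Accept with probability min(1, posterior ratio), in log space.
        # 1 - random() lies in (0, 1], which keeps log() defined.
        if math.log(1.0 - rng.random()) < lp - current_lp:
            current, current_lp = proposal, lp
            accepted += 1
        chain.append(list(current))
    return chain, accepted / n_iter


if __name__ == "__main__":
    rng = random.Random(1)
    times = list(range(20))
    truth = [5.0, 0.3, 1.0]  # a, b, bias (synthetic values)
    obs = [s + truth[2] + rng.gauss(0.0, 0.2)
           for s in toy_model(truth[:2], times)]
    bounds = [(0.0, 10.0), (0.0, 1.0), (-3.0, 3.0)]
    chain, rate = metropolis(times, obs, 0.2, bounds, start=[4.0, 0.2, 0.0],
                             step_sizes=[0.1, 0.01, 0.1], n_iter=5000)
    print(f"acceptance rate: {rate:.2f}")
```

Treating the bias as one more sampled quantity means its uncertainty is carried into the posterior for the model parameters, rather than being fixed in a separate preliminary step. In a real application the early part of the chain would be discarded as burn-in before summarising the samples.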
Bayesian model comparison, the second operation used in the framework, aims to determine the extent to which the data support the different models. This is done by providing a probability distribution over models rather than parameter values. As far as we are aware, this paper is the first attempt to assess whether Bayesian model comparison is useful for model selection among parameter-rich process-based ecosystem models.
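A probability distribution over models can be obtained from each model's marginal likelihood and a prior probability per model. The sketch below assumes the log marginal likelihoods have already been estimated by some other means; the function name and the model labels are hypothetical, and the excerpt does not state how the authors computed these quantities.

```python
import math


def posterior_model_probabilities(log_marginal_likelihoods, prior_probs=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    log_marginal_likelihoods: dict mapping model name -> log p(D | model).
    prior_probs: dict mapping model name -> prior probability; if omitted,
    all models are given equal prior probability.
    """
    names = list(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = {name: 1.0 / len(names) for name in names}
    log_weights = {name: log_marginal_likelihoods[name]
                   + math.log(prior_probs[name]) for name in names}
    # Subtract the largest log weight before exponentiating so that very
    # negative log likelihoods do not all underflow to zero.
    peak = max(log_weights.values())
    unnormalised = {name: math.exp(w - peak)
                    for name, w in log_weights.items()}
    total = sum(unnormalised.values())
    return {name: u / total for name, u in unnormalised.items()}


if __name__ == "__main__":
    # Made-up log marginal likelihoods for four hypothetical models.
    example = {"model_A": -1000.0, "model_B": -1002.0,
               "model_C": -1010.0, "model_D": -1050.0}
    for name, prob in posterior_model_probabilities(example).items():
        print(name, round(prob, 4))
```

With equal prior probabilities, the ratio of two posterior model probabilities equals the Bayes factor between those models.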
Detailed analysis of model–data mismatch, the third operation in our framework, is not a common step in Bayesian model studies, which tend to focus on the probabilistic aspects of model behaviour rather than the internal structure of the models (Gelman and Shalizi, 2010). Bayesian calibration and model comparison effectively treat models as black boxes that convert parameter values into outputs, so this further analysis is needed to facilitate model improvement.
In summary, this paper aims to show the strengths and weaknesses of this three-operation Bayesian framework using a case-study with four models simulating the biogeochemistry of a Central European spruce forest.
Data
All data were taken from the Norway spruce (Picea abies L.) site at Höglwald, Germany, latitude 48°30′N, longitude 11°10′E, altitude 540 m (Papen and Butterbach-Bahl, 1999). Trees were planted in 1907. Soil C and N were around 90,000 and 5000 kg ha−1 (Kreutzer et al., 2009, Rothe, 1997). For the years 1985–1995, mean annual temperature was 7.9 °C, precipitation 888 mm, and atmospheric N-deposition as measured in the throughfall 39.4 kg N ha−1 (Rothe, 1997). For 1975–1990, average global radiation was
Bayesian calibration
All four models were calibrated using the same MCMC-algorithm, i.e. Metropolis sampling. Burn-in and convergence were determined visually, by each modelling group separately, but an additional analysis of the Markov chains was carried out to confirm that parameter distributions had properly stabilised. The analysis was based on the fact that, after a chain reaches convergence, subsequent distinct and sufficiently long sub-chains should have similar sample means and variances. We compared the
Bayesian calibration: methodological issues
Bayesian calibration uses data to update the joint probability distribution for a model's parameters. The Bayesian approach allows for non-Gaussian distributions for both parameter uncertainty and measurement error. Our calibration was therefore based on sampling by means of MCMC rather than on matrix inversion methods. This in turn allowed us to include systematic data error in the calibration, rather than having to estimate error terms in a first separate step, as was done for example by
Conclusions
- Bayesian calibration can be used to reduce parametric uncertainty of complex dynamic models for forest biogeochemistry.
- Bayesian calibration allows for the use of datasets that contain long time series of gas emissions with high intra- and interannual variability, and with both random and systematic error.
- Data need to be compared with models at the appropriate temporal scale. This may involve, as shown here, monthly averaging and the calculation of annual frequency distributions. These
Acknowledgements
We thank the European Union for financial support to carry out this work in projects NitroEurope (FP6, GA 017841) and Carbo-Extreme (FP7, GA 226701), and we are grateful to our colleagues in these projects for discussion. J.B.Y. wishes to acknowledge the help of M. Richards in programming the routines for the Bayesian analysis of DAYCENT. We express our thanks to two anonymous reviewers for their constructive comments on the original manuscript.
References (53)
- et al. Model evaluation of different mechanisms driving freeze-thaw N2O emissions. Agriculture, Ecosystems & Environment (2009)
- et al. The impact of nitrogen deposition on carbon sequestration by European forests and heathlands. Forest Ecology and Management (2009)
- et al. The REFLEX project: comparing different algorithms and implementations for the inversion of a terrestrial ecosystem model against eddy covariance data. Agricultural and Forest Meteorology (2009)
- et al. Bayesian treatment of a chemical mass balance receptor model with multiplicative error structure. Atmospheric Environment (2009)
- et al. Uncertainties in eddy covariance flux measurements assessed from CH4 and N2O observations. Agricultural and Forest Meteorology (2010)
- et al. Elicitation of multivariate prior distributions: a nonparametric Bayesian approach. Journal of Statistical Planning and Inference (2010)
- et al. Simulation of NO and N2O emissions from a spruce forest during a freeze/thaw event using an N-flux submodel from the PnET-N-DNDC model integrated to CoupModel. Ecological Modelling (2008)
- et al. Challenges in quantifying biosphere-atmosphere exchange of nitrogen species. Environmental Pollution (2007)
- et al. Bayesian calibration of a model describing carbon, water and heat fluxes for a Swedish boreal forest stand. Ecological Modelling (2008)
- et al. Heterotrophic soil respiration—comparison of different models describing its temperature dependence. Ecological Modelling (2008)
- A review of applications of model–data fusion to studies of terrestrial carbon fluxes at different scales. Agricultural and Forest Meteorology
- Bayesian calibration as a tool for initialising the carbon pools of dynamic soil models. Soil Biology & Biochemistry
- Effect of tree distance on N2O and CH4-fluxes from soils in temperate forest ecosystems. Plant and Soil
- Marginal likelihood from the Metropolis–Hastings output. Journal of the American Statistical Association
- Why environmental scientists are becoming Bayesians. Ecology Letters
- Simulated interaction of carbon dynamics and nitrogen trace gas fluxes using the DAYCENT model
- The likely impact of elevated [CO2], nitrogen deposition, increased temperature, and management on carbon sequestration in temperate and boreal forest ecosystems. A literature review. New Phytologist
- Probability Theory: The Logic of Science
- Bayes factors. Journal of the American Statistical Association
- Inventories of N2O and NO emissions from European forest soils. Biogeosciences
- Bayesian calibration method used to elucidate carbon turnover in forest on drained organic soil. Biogeochemistry
- Comparing simulated and measured values using mean squared deviation and its components. Agronomy Journal
- The complete nitrogen cycle of an N-saturated spruce forest ecosystem. Plant Biology
- A process-oriented model of N2O and NO emissions from forest soils: 1. Model development. Journal of Geophysical Research: Atmospheres