| `Assessing the biological quality of fresh waters: RIVPACS and other techniques', edited by John F. Wright, David W. Sutcliffe and Mike T. Furse. Published by the Freshwater Biological Association, Ambleside, June 2000. ISBN 0 900386 62 2. 400 pages. Price £40 softback, £60 hardback (including p. & p.). |
RIVPACS - Model Development

Linking Biological Groups with Environmental Variables

Having classified the reference sites into groups on the basis of their macroinvertebrate community composition, the next step is to link the biological grouping of sites to environmental characteristics.

How do we select the environmental predictor variables? The occurrence and abundance of macroinvertebrates depend on the availability of suitable habitats, flow regimes and food. However, detailed information on habitats and food availability is difficult to acquire at the spatial scale of a river stretch and across such a wide range of sites. In statistical modelling approaches such as RIVPACS, the aim is to select a set of environmental predictor variables which can be measured in a standardised way at any type of river site. The variables used are therefore not necessarily causal determinants of the macroinvertebrate community composition; many are merely robust correlates, which can be used to predict the expected fauna at any type of site.
Table 1. The predictor variables used in the current version of RIVPACS

One of the problems of selecting predictor variables is that one does not want to include a variable if its values, which will be used to set the expected fauna, have already been altered by the pollution or environmental stress whose effect one is trying to assess. As a result, variables such as water chemistry, ephemeral macrophyte habitats and recent flow regime may not be appropriate. Thus the most appropriate predictive model may NOT be the one which gives the best possible statistical fit.

The ideal long-term aim is to have a single fixed prediction of the fauna expected at any particular site, based either on time-invariant characteristics or at least on the historical values of parameters measured at the site when it was considered to be of good quality. However, this is not always possible in reality.

The current version of RIVPACS uses 11 variables in the prediction model (Table 1). They were selected as the optimal subset of an initial list of 28 variables, and they give the best discrimination of the biological groups whilst trying to minimise the problems just described. Unbiased estimates of discriminatory ability can be obtained using cross-validation or leave-one-out techniques.

The biological groups and environmental variables were linked using the multivariate statistical technique of Multiple Discriminant Analysis (MDA). MDA derives predictive equations, referred to as discriminant axes, which select those aspects of the environmental variation that differ most between the site groups. These predictive equations are then used to estimate to which biological group or groups any site belongs, based on the same variables measuring its environmental characteristics.
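The leave-one-out idea mentioned above can be illustrated with a minimal Python sketch. It is not the RIVPACS implementation: a simple nearest-group-mean rule stands in for MDA, and the site values, the two variables and all function names are invented for illustration only.

```python
# Leave-one-out estimate of how well environmental variables discriminate
# biological groups. ASSUMPTION: a nearest-group-mean rule is used here as a
# simplified stand-in for MDA; the data below are invented.

def group_means(rows, labels):
    """Mean of each environmental variable within each biological group."""
    sums, counts = {}, {}
    for row, group in zip(rows, labels):
        acc = sums.setdefault(group, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[group] = counts.get(group, 0) + 1
    return {g: [s / counts[g] for s in acc] for g, acc in sums.items()}

def nearest_group(row, means):
    """Group whose mean is closest to the site (squared Euclidean distance)."""
    return min(means, key=lambda g: sum((a - b) ** 2 for a, b in zip(row, means[g])))

def resubstitution_accuracy(rows, labels):
    """Fit on all sites, then predict the same sites (optimistic estimate)."""
    means = group_means(rows, labels)
    hits = sum(1 for row, g in zip(rows, labels) if nearest_group(row, means) == g)
    return hits / len(rows)

def leave_one_out_accuracy(rows, labels):
    """Refit without each site in turn, then predict the held-out site."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        means = group_means(train_rows, train_labels)
        if nearest_group(rows[i], means) == labels[i]:
            hits += 1
    return hits / len(rows)

# Hypothetical sites: two environmental variables each, in arbitrary units.
sites = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [6.0, 6.0], [7.0, 7.0]]
groups = ["A", "A", "A", "B", "B", "B"]

resub = resubstitution_accuracy(sites, groups)
loo = leave_one_out_accuracy(sites, groups)
print(f"resubstitution: {resub:.2f}")  # resubstitution: 1.00
print(f"leave-one-out:  {loo:.2f}")    # leave-one-out:  0.83
```

On these made-up sites the resubstitution figure is optimistic because each site helps to define the group mean it is then compared against. Holding the site out removes that advantage, which is why the leave-one-out figure is the less biased of the two.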
Figure 1. Explanation of Multiple Discriminant Analysis (MDA)

The simple example in Figure 1 shows the principle with sites from just two groups (A and B) derived from their biology. In terms of mean substratum particle size, there is considerable overlap in values between the two groups. Similarly, in terms of the distance of the sites from their source, the two groups are not clearly distinct. However, if we view the variation along the optimally discriminating diagonal direction, represented by the new derived variable M, the values for the two groups of sites projected onto this discriminant axis do not overlap. We could therefore always predict, without error, which group a new site belonged to from its substratum composition and distance from source. This idea extends easily to more than two groups and to many variables; it simply involves more axes, which jointly explain all the environmental differences between the groups.
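The two-group case described above can be sketched in a few lines of Python. This is a minimal illustration of a two-group linear discriminant (Fisher's method), not the RIVPACS code; the site values are invented, and the names `discriminant_direction`, `project` and `ranges_overlap` are hypothetical helpers.

```python
# Two-group linear discriminant in two variables.
# ASSUMPTION: invented sites; x = distance from source, y = mean substratum
# particle size, both in arbitrary units.

def mean_vec(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def scatter(points):
    """Within-group sums of squares and cross-products (sxx, sxy, syy)."""
    mx, my = mean_vec(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy

def discriminant_direction(group_a, group_b):
    """w = S^-1 (mean_a - mean_b), with S the pooled within-group scatter."""
    sxx, sxy, syy = (a + b for a, b in zip(scatter(group_a), scatter(group_b)))
    det = sxx * syy - sxy * sxy
    (ax, ay), (bx, by) = mean_vec(group_a), mean_vec(group_b)
    dx, dy = ax - bx, ay - by
    # Explicit inverse of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)

def project(points, w):
    """Value of the derived variable M for each site."""
    return [w[0] * x + w[1] * y for x, y in points]

def ranges_overlap(u, v):
    return min(u) <= max(v) and min(v) <= max(u)

group_a = [(1.0, 2.5), (2.0, 3.0), (3.0, 4.5), (4.0, 5.0)]
group_b = [(2.0, 1.0), (3.0, 2.5), (4.0, 3.0), (5.0, 4.5)]

w = discriminant_direction(group_a, group_b)
m_a, m_b = project(group_a, w), project(group_b, w)

overlap_x = ranges_overlap([x for x, _ in group_a], [x for x, _ in group_b])
overlap_y = ranges_overlap([y for _, y in group_a], [y for _, y in group_b])
overlap_m = ranges_overlap(m_a, m_b)
print(overlap_x, overlap_y, overlap_m)  # True True False
```

Each variable taken alone leaves the two groups overlapping, but the projected values on M are cleanly separated, which is the situation the figure describes. With more groups and variables the same idea needs a matrix eigen-decomposition rather than this closed-form 2x2 inverse.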
Figure 2. Assigning a new test site to classification groups using probabilities

As mentioned previously, the assignment of sites to distinct groups does not imply that such groupings actually exist in the environment. The reality is that the sites are situated along continuous environmental gradients. Therefore, on the basis of their environmental attributes, new test sites are not simply assigned to their single most likely group; they are assigned probabilistically to the site groups (Figure 2).

In Figure 2 the circles represent the variability between sites in each of three biologically derived groups (A, B and C) on the discriminant axes derived from the environmental variables. Real groups would be much more variable and would almost certainly overlap in their environmental features. On the basis of its environmental features, the new site is placed at point X. The distance of site X from a group's mean is used to calculate its probability of belonging to that group. This test site has a probability of 0.6 of belonging to group A, 0.3 to group B and 0.1 to group C. Typically in RIVPACS, a test site will have a predicted probability of more than 1% of belonging to between one and five of the 35 TWINSPAN biological groups.

Importantly, this MDA approach provides a method of automatically identifying any test sites which are outside the scope encompassed by the RIVPACS reference sites. Sites whose environmental characteristics give them a low probability of belonging to any of the RIVPACS site groups are highlighted automatically with a warning in the RIVPACS output. In extreme cases, the software will even refuse to make a prediction.

Apart from the initial idea of using high quality reference sites to set a "target" or expected invertebrate community for a site, this probabilistic assignment of sites to biological groups was perhaps the most important decision in developing RIVPACS. It makes the predictions of the expected community robust to the precise choice of method of classifying the reference sites.
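The step from distances to group probabilities can be sketched as follows. This is a hedged illustration rather than the RIVPACS software: it assumes the standard discriminant-analysis form in which the probability of group g is proportional to n_g * exp(-d_g^2 / 2), where d_g is the distance from the test site to the group mean in discriminant space and n_g is the number of reference sites in the group. The distances, group sizes and the `threshold` used for the scope check are hypothetical values chosen so that the result matches the 0.6, 0.3 and 0.1 of the example.

```python
import math

def group_probabilities(sq_distances, group_sizes):
    """Probability of belonging to each group from squared distances.

    ASSUMPTION: P(g) is proportional to n_g * exp(-d_g^2 / 2).
    """
    d_min = min(sq_distances.values())  # subtracted for numerical stability
    weights = {
        g: group_sizes[g] * math.exp(-(d2 - d_min) / 2.0)
        for g, d2 in sq_distances.items()
    }
    total = sum(weights.values())
    return {g: wt / total for g, wt in weights.items()}

def outside_scope(sq_distances, threshold):
    """Flag a site that is far from every group mean.

    ASSUMPTION: a single fixed cut-off on the smallest squared distance;
    the value used below is a placeholder, not a RIVPACS setting.
    """
    return min(sq_distances.values()) > threshold

# Hypothetical squared distances from test site X to the three group means.
site_x = {"A": 1.0, "B": 1.0 + 2.0 * math.log(2.0), "C": 1.0 + 2.0 * math.log(6.0)}
sizes = {"A": 20, "B": 20, "C": 20}

probs = group_probabilities(site_x, sizes)
print({g: round(p, 3) for g, p in probs.items()})  # {'A': 0.6, 'B': 0.3, 'C': 0.1}

# A site far from all three groups is flagged instead of being predicted.
remote_site = {"A": 40.0, "B": 55.0, "C": 61.0}
print(outside_scope(site_x, threshold=20.0))       # False
print(outside_scope(remote_site, threshold=20.0))  # True
```

Note that the probabilities always sum to one, so on their own they cannot reveal a site that is unlike every group; that is why the scope check looks at the distances themselves rather than at the probabilities.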





