My recent thinking has been shaped by my peripheral involvement in discussions between colleagues at the University of Ottawa. What I discuss here is a confluence of ideas expressed by several people. I say this because these have been stimulating discussions, and I don’t want to appear to be taking full credit for these ideas by virtue of this solo-author blog post.
All other things being equal, mechanistic models are more powerful since they tell you about the underlying processes driving patterns. They are more likely to work correctly when extrapolating beyond the observed conditions.
-Bolker (2008) Ecological Models and Data in R, p. 7.
My question for today is:
Given a mechanistic model and a phenomenological (or statistical) model, if we are trying to determine which model is best, shouldn’t the mechanistic model score some ‘points’ by virtue of it being mechanistic?
Assume a data set that both models are intended to describe. Define mechanistic and phenomenological as follows:
Mechanistic model: a hypothesized relationship between the variables in the data set where the nature of the relationship is specified in terms of the biological processes that are thought to have given rise to the data. The parameters in the mechanistic model all have biological definitions and so they can be measured independently of the data set referenced above.
Phenomenological/Statistical model: a hypothesized relationship between the variables in the data set, where the relationship seeks only to best describe the data.
These definitions are taken from The Ecological Detective by Ray Hilborn and Marc Mangel. Here are some additional comments from Hilborn and Mangel:
A statistical model foregoes any attempt to explain why the variables interact the way they do, and simply attempts to describe the relationship, with the assumption that the relationship extends past the measured values. Regression models are the standard form of such descriptions, and Peters (1991) argued that the only predictive models in ecology should be statistical ones; we consider this an overly narrow viewpoint.
Having defined mechanistic and phenomenological, the final piece of the puzzle is to define ‘best’. Conventional wisdom is that mechanistic models facilitate a biological understanding. However, I think that’s only one step removed from prediction: you want to take your newfound understanding and do something with it, specifically make a prediction and test it. Therefore, the goal of both mechanistic and phenomenological models is to predict, and assessing how well a model does this is referred to as model validation.
But validation data are not always available. One reason is that if the models predict into the future, we have to wait until the validation data appear. The other is that if we don’t have to wait, it’s a bit tempting to take a sneak peek at the validation data. For both model types, you want to present a model that is good given all the information available. It’s tough to write a paper whose conclusion is that your model is poor when the apparent poorness can be ‘fixed’ by using the validation data to calibrate/parameterize the model (which then leaves no data to validate the model, something that, if anything, is a relief because your previous try at model validation didn’t go so well).
In the absence of any validation data, one way to select the best model is to use the Akaike Information Criterion (AIC) (or a similar criterion). AIC will choose a model that fits the data well without involving too many parameters, but does AIC tell me which model is best, given my above definition of best, when comparing a mechanistic and a statistical model? Earlier this week, I said that if we wanted to settle which is better, mechanistic or phenomenological, then we could settle it in the ring with an AIC battle-to-the-death.
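To make the ‘battle’ concrete, here is a minimal sketch of how such a comparison might be scored, assuming independent Gaussian errors. Everything in it is hypothetical and only for illustration: the counts are made-up toy numbers, the exponential-growth model with independently measured `N0` and `R` stands in for ‘a mechanistic model’, a straight-line regression stands in for ‘a statistical model’, and the helper names `gaussian_aic` and `fit_line` are my own.

```python
import math

def gaussian_aic(observed, predicted, n_fitted):
    """AIC = 2k - 2 ln(L), assuming independent Gaussian errors.

    n_fitted counts the model parameters estimated from this data set.
    The error variance is always estimated here, so k = n_fitted + 1.
    """
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return 2 * (n_fitted + 1) - 2 * log_lik

def fit_line(x, y):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up toy data (NOT real observations): population counts over time.
times = [0, 1, 2, 3, 4, 5]
counts = [10.2, 13.1, 18.5, 24.0, 33.5, 44.1]

# Hypothetical mechanistic model: exponential growth, with both parameters
# assumed to have been measured independently, so nothing is fitted here.
N0, R = 10.0, 0.3
mech_pred = [N0 * math.exp(R * t) for t in times]
aic_mech = gaussian_aic(counts, mech_pred, n_fitted=0)

# Hypothetical statistical model: a straight line, with the intercept and
# slope both estimated from the same data set.
intercept, slope = fit_line(times, counts)
stat_pred = [intercept + slope * t for t in times]
aic_stat = gaussian_aic(counts, stat_pred, n_fitted=2)

print(f"AIC mechanistic: {aic_mech:.2f}")
print(f"AIC statistical: {aic_stat:.2f}")
```

The lower AIC wins. Note that the score sees only the fit and the number of fitted parameters; nothing in it knows whether a model has a biological interpretation. With a sample this small, the small-sample correction AICc would arguably be the more appropriate criterion.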
As the one who was championing the mechanistic approach, I now feel like I didn’t quite think that one through. Compare the set of all phenomenological models with the set of all mechanistic models (with respect to a particular data set): it’s not rocket science to figure out which set is a subset of the other. Every mechanistic model also describes a relationship in the data, so on fit alone the larger, phenomenological set can never do worse. But if one model is a relationship that comes with a biological explanation too, then you’re getting something more than you get from a model that just describes a relationship. Shouldn’t I get some points for that? Didn’t I earn that when I took the mechanistic approach to modelling, because my options for candidate models are much more limited?
There is one way that mechanistic models are already getting points from AIC. If I did a good job of parameterizing my mechanistic model from independent measurements, there should be few fitted parameters, hopefully even none. But is that enough of an advantage? Exactly what advantage do I want? I think what I am hoping for is related to the span of data sets that the model could then be applied to for prediction or validation. I feel pretty confident taking my mechanistic model off to another setting and testing it out, but if my model were purely statistical I might be less confident in doing so. Possibly this is because if my mechanistic model failed in the new setting I could ask ‘what went wrong?’ (in terms of my process-based assumptions) and I’d have a starting point for revising my model. If my statistical model didn’t do so well in the new setting, I might not have much to go on if I wanted to try to figure out why.
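The size of that existing advantage can be written down from the standard definition of AIC. In the sketch below, the subscripts $m$ and $s$ (mechanistic and statistical) are my own labels, $k$ is the number of parameters fitted to the data set, and $\hat{L}$ is the maximized likelihood:

$$
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{AIC}_s - \mathrm{AIC}_m = 2\,(k_s - k_m) - 2\,(\ln\hat{L}_s - \ln\hat{L}_m).
$$

So the mechanistic model is preferred whenever

$$
\ln\hat{L}_s - \ln\hat{L}_m < k_s - k_m .
$$

In words, the statistical model has to improve the log-likelihood by more than one unit for each extra parameter it fits. Those are all the ‘points’ AIC awards for measuring parameters independently; it gives none for the biological explanation itself.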
But if the objective is only to predict, then you don’t need to know about mechanisms, and so the phenomenological/statistical approach is the most direct and arguably the best way of generating a good predictive model. Perhaps what this issue revolves around is that mechanistic models make general but less accurate predictions (i.e., the predictions might apply to a number of different settings), while phenomenological models make accurate but narrow predictions.
Truth be told, this issue is tugging at my faith (mechanistic models), and I’m not really happy with my answers to some of the fundamental questions about why I favour the mechanistic approach. And let me say, too, that I definitely don’t think that mechanistic models are better than phenomenological models; I think that each has its place and I’m just wondering about which places those are.