[This post is frequently revised]

In the previous post, after defining a mathematical model, I asked how and why we make models. That post addressed how we make models, but I did not yet discuss why we make them.

To examine why we make mathematical models: first, since we are considering mathematical models in biology, our model should address a biological question, or at least link to a biological question that has been posed previously or will be addressed subsequently. In addition to having biological, and therefore practical, relevance, here are some more reasons to make a mathematical model:

1. To make a quantitative prediction:
   *Next year there will be seven frogs.*
2. To make a qualitative prediction:
   *Next year there will be fewer frogs.*
3. To use information from one scale to understand another (i.e., multiscale modelling):
   *Given our knowledge of how individual frogs reproduce and disperse, we use the mathematical model to predict the future species range of the frog population.*
4. To derive indirect methods to estimate parameters (or other quantities).
5. To clarify the reasoning of a given argument:
   *The frog population in Frog Lake is growing exponentially; however, predator density in Frog Lake has been constant over the last ten years, possibly owing to an extraneous factor. We derive a mathematical model to show that the persistence of the extraneous factor is necessary for the frog population to grow exponentially.*
6. To investigate hypothetical scenarios (and then make a quantitative or qualitative prediction), particularly scenarios on long time scales or large spatial scales for which it may be difficult to collect data or to conduct an experiment.
7. To motivate experiments:
   *e.g., the experimental manipulation of the extraneous factor in Frog Lake.*
8. To disentangle multiple causation:
   *Every year frogs die from either (a) old age or (b) Undetectable Frog Disease. The goal of the mathematical model is to show that time series frog mortality data would exhibit telltale signs of the relative contributions of (a) and (b).*
9. To make an idea or hypothesis precise, to integrate thinking, and to think about a problem systematically.
10. To inform data needs to best answer a given question.
11. To identify the processes or parameters that outcomes are most sensitive to (i.e., using a sensitivity analysis).**
12. To determine the necessary requirements for a given relationship:
   *Undetectable Frog Disease evolves intermediate virulence only under a convex trade-off between transmission and virulence.*
13. To characterize all theoretically possible outcomes.***
14. To identify common elements from seemingly disparate situations.
15. To detect hidden symmetries and patterns.****

What do you think of this list?

As a long-range objective of the blog, I would like to come up with an exhaustive list of the possible types of motivation for making mathematical models, which I would then like to draw as a Venn diagram (since 1. and 6. are not mutually exclusive, but 1. and 2. are*).

Note that 1-8 are closely related to ‘what is your question?’. It has been said that coming up with a good question is an important component of doing good science, so understanding the range of possible motivations for wanting to use a mathematical model seems central to making smart decisions about how best to direct one’s scientific investigations. It’s a question that’s worth exploring, and in the future I will summarize Chapter 1 of A Biologist’s Guide to Mathematical Modelling, which also covers this topic.

* In a sense every quantitative prediction is also qualitative; however, from the point of view of characterizing the motivation for a particular modelling project, I think you need to choose one or the other because this will strongly influence how the project is best carried out.

** Thanks to Prof. Jianhong Wu and Helen Alexander for contributing to the list.

*** 13. is really a more robust version of 6. A bifurcation diagram might be an example of 13. The same is true of 12. and 5. Numbers 9. and 5. seem similar to me, except that 5. is focused on identifying an explanation consistent with a relationship, while 9. just seeks to be precise (without being motivated by a relationship that needs explaining).

**** See this post for the relevant citations.

nice post. how about, “to make an idea/hypothesis precise”? this would be closely related to #5 / at least partly overlapping. i think that formulating a model in a mathematical way forces you to state exactly what you are assuming or how you think the system works, without being able to brush over things vaguely in a verbal description. sometimes i don’t realize i don’t understand something until i try to write the equations…

Thanks Helen, that’s a good point. I revised the list now and added that one in.

That’s probably the single most important reason to build a model: to force yourself to be precise and explicit about your assumptions, and about what follows from them.

You might be interested in Bill Wimsatt’s list of the many uses of false models, and his discussion of the many ways in which a false model can be useful because of, rather than despite, its falsehood.

http://mechanism.ucsd.edu/teaching/models/Wimsatt.falsemodels.pdf (this book chapter was originally a chapter in Nitecki and Hoffman’s 1987 book, Neutral Models in Biology).

Thanks. I’ll take a look.

I’ve got a reason that doesn’t seem to quite fit into your categories. It’s related to the “virtual ecologist” approach advocated by Zurell et al. 2010. I’d succinctly state the reason as “To explain why empirical observations are not as expected by theory.” In other words: theory predicts X, empirical observations show Y. It may be that theory is wrong. Another possibility, however, is that theory is right, but that X appears to be Y when measured in the way that the empiricists are measuring the phenomenon. By modeling the empirical side of the problem as well – modelling how the empirical measurements are taken, the consequences of the sampling scheme used in measurement, etc. – a model can show how and why X might appear to be Y, and thus resolve the conflict between theory and observation. I’ve got an example of this that I’m working on, but which this margin is too small to contain…

Oh cool, it’s Ben’s Last Theorem; you should call it the BLT! Thanks for the reference, I’ll take a look at some point. I think I consider THE MODEL, in its most comprehensive version, to include an explicit statement about how errors occur (i.e., measurement, demographic, environmental) and I think the model should be derived with the data collection procedures in mind. The only example that I can really think of (regarding what you have said) is something like an SI model where if you solve the ODEs you get the number of infected (I) people on each day. However, what is typically reported during an epidemic is the number of new cases each day. These are not the same thing, but it’s straightforward to add another ODE to the system of equations and have this be an output. Perhaps that is similar to what you are saying?
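
The extra output equation mentioned in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the commenter's own model: the function name `simulate_si`, the parameter values, the frequency-dependent transmission term `beta*S*I/N`, and the simple Euler integration are all choices made here for the example. The added equation `dC/dt = beta*S*I/N` accumulates infections, and the daily new cases are the day-to-day differences in `C`.

```python
def simulate_si(beta=0.5, n=1000.0, i0=1.0, days=30, steps_per_day=100):
    """Euler integration of an SI model plus a cumulative-case equation.

    dS/dt = -beta*S*I/N
    dI/dt =  beta*S*I/N
    dC/dt =  beta*S*I/N   (added output equation: cumulative infections)

    All names and default values are hypothetical, chosen for illustration.
    """
    dt = 1.0 / steps_per_day
    s, i, c = n - i0, i0, 0.0
    prevalence = [i]   # number infected at the start of each day
    cumulative = [c]   # cumulative infections at the start of each day
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            s -= new_infections
            i += new_infections
            c += new_infections
        prevalence.append(i)
        cumulative.append(c)
    # Reported quantity: new cases per day, the daily increments of C.
    daily_new_cases = [b - a for a, b in zip(cumulative, cumulative[1:])]
    return prevalence, daily_new_cases


prevalence, daily_new_cases = simulate_si()
```

With these assumed values, `prevalence` rises and saturates near the population size, while `daily_new_cases` rises, peaks, and falls again, which is the sense in which the two series are not the same thing.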

Mmm, I love BLTs. :-> Well, I’d say your example is on the right track. I can’t give the example I’m currently working on, since it’s still in prep. But to go with your example: suppose the SI models in the literature typically predicted one pattern of infections over time, but the empirical pattern observed was quite different. (I don’t know that that is actually the case, but this is hypothetical.) You might hypothesize that the reason had to do with some aspect of how the infection pattern is observed empirically. Let’s say that new infections, due to the nature of the test, can’t be detected until six months in. Additionally, the people measuring infections in the field might be biased in their reports; once it is clear that an outbreak has occurred, their reports are accurate, but prior to that time, they tend to under-report because they attribute symptoms to other causes. Existing SI models, being more concerned with the true pattern of infections, might not have taken these facts into account. To test your hypothesis, you make a new model which is mechanistically the same as prior models, but which adds an explicit observation step that attempts to replicate the observational procedure used in reality, biases and all. Your new model might show that the existing SI models do in fact predict the pattern of infections, once observational issues are accounted for. Yeah?
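
A minimal sketch of such an explicit observation step is given below. Everything in it is hypothetical: the function name `observe`, the `delay`, `early_fraction`, and `recognition_threshold` parameters, the rule that an outbreak counts as recognized once enough detectable cases have accumulated, and the synthetic input series are assumptions made only to illustrate the idea. The mechanistic model is left untouched; the observation step just maps true new infections to what would be reported.

```python
def observe(true_new_cases, delay=2, early_fraction=0.5, recognition_threshold=30.0):
    """Map true incidence to reported incidence (hypothetical observation step).

    delay: time steps before an infection becomes detectable.
    early_fraction: fraction of detectable cases reported before the
        outbreak is recognized (under-reporting).
    recognition_threshold: cumulative detectable cases after which
        reporting is assumed to be accurate.
    """
    reported = []
    total_detectable = 0.0
    for t in range(len(true_new_cases)):
        if t < delay:
            reported.append(0.0)  # nothing is detectable yet
            continue
        cases = true_new_cases[t - delay]  # infections that become detectable now
        if total_detectable >= recognition_threshold:
            fraction = 1.0  # outbreak recognized: accurate reports
        else:
            fraction = early_fraction  # symptoms attributed to other causes
        reported.append(fraction * cases)
        total_detectable += cases
    return reported


# Synthetic series for illustration only (not data).
true_cases = [1, 2, 4, 8, 16, 32, 64, 64, 32, 16]
reported_cases = observe(true_cases)
```

Comparing `reported_cases` with `true_cases` shows how the same underlying dynamics can look like a later, initially smaller outbreak once the delay and early under-reporting are applied.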

Yeah, for sure. Another direction you could take this is to ask how the sampling procedure should be structured to maximize your chances of drawing the kind of inference you are interested in. For example, you could more intensively sample variables that are a direct result of parameters that the model predictions are known to be sensitive to. Good luck!

Right. Since I’m a theoretician, however, it’s not up to me to determine how the empiricists do their sampling. :-> In the paper I’m working on, this is indeed where we go in our conclusions: we give recommendations to the empiricists regarding how they might sample differently, and analyze their samples differently, in order to better detect the processes of interest. Cheers!
