Report from the 4th Annual AARMS Mathematical Biology Workshop

On the weekend, I attended the 4th Annual AARMS Mathematical Biology Workshop hosted at Dalhousie University in Halifax. Workshop participants were a mix of graduate students, postdocs and professors from Memorial University of Newfoundland, Dalhousie University, the University of New Brunswick, York University, Wilfrid Laurier University and the College of William and Mary. The research presented covered a wide range of model types, including partial differential equations, ordinary differential equations, time delays, stochastic processes, algorithms to build maximum agreement forests, and power laws. The applications of these models covered a variety of biological situations, including cellular processes, the immune system, phylogenetics, epidemiology, ecology, and ecosystem models for marine systems.

For me, this meeting was a great way to meet other local researchers with similar interests in mathematical biology. Identifying common interests helps in choosing good candidate topics for future workshops and summer schools, and in knowing where we can go for advice on different aspects of our research. For graduate students, I hope that these types of initiatives provide a breadth of exposure, to better understand what type of research you like and to get ideas for where you might like to take your research in the future.

… and Halifax was beautiful! Thanks to David Iron, the other workshop organizers, and to AARMS for providing funds to help support student travel.

On the art of ecological modelling

An article by Ian Boyd in this week’s Science argues that there is a systemic problem of researchers under-acknowledging the limits of model prediction:

Many of today’s ecological models have excessively optimistic ambitions to predict ecosystem and population dynamics.

And:

The main models are general population models (16) and data-driven, heuristic, ecosystem models (17,18), which are rarely validated and often overparameterized.

The good news is that:

Some recent studies (3–5)—including that by Mougi and Kondoh (6) on page 349 of this issue—help to specify where the limits of prediction may lie.

Initially, I thought that these articles (3–6) might be the answer that I’ve been searching for, but it seems that Dai et al. (2012), Allesina and Tang (2012; see also here) and Mougi and Kondoh (2012) are examples of well-derived models for specific questions, not rules for deciding how complex is too complex across a general range of ecological questions. Liu et al. (2011) is a general result, but it asks a different question, speaking to the difficulty of controlling real complex systems.

Ian Boyd’s article raises more questions than answers, but it draws attention to an important question, which is highly worthwhile in and of itself.

Do we need to derive mechanistic error distributions for deterministic models?

In fitting mathematical models to empirical data, one challenge is that deterministic models make exact predictions and empirical observations usually do not match up perfectly. Changing a parameter value may reduce the distance between the model predictions and some of the data points, but increase the distance to others. To estimate the best-fit model parameters, it is necessary to assign a probability to deviations of a given magnitude. The function that assigns these probabilities is called the error distribution. In this post, I ask:

Do mechanistic, deterministic mathematical models necessarily have to have error distributions that are mechanistically derived?

One of the simplest approaches to model fitting is to use a probability density function, such as the normal or Poisson distribution, for the error function and to use a numerical maximization algorithm to identify the best-fit parameters. The more parameters there are to estimate, the more time-consuming this numerical search becomes, but in most cases this approach to parameter estimation is successful.
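To make the simple approach concrete, here is a minimal sketch. The logistic growth model, the parameter names (r, K, sigma) and the data are hypothetical illustrations I have added, not anything from a real study; the point is only the structure of assuming a normal error distribution and numerically maximizing the likelihood.

```python
# A minimal sketch of the "simple approach": solve a deterministic model
# (here, hypothetical logistic growth) and numerically maximize a normal
# likelihood for the deviations. Data and parameter values are made up.
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import minimize

def logistic(x, t, r, K):
    return r * x * (1 - x / K)

t_obs = np.arange(0.0, 10.0, 1.0)
y_obs = np.array([2.1, 3.4, 5.2, 8.0, 11.5, 15.0, 17.8, 19.2, 19.9, 20.3])  # made-up counts

def neg_log_lik(params):
    r, K, sigma = params
    if r <= 0 or K <= 0 or sigma <= 0:
        return np.inf
    x_pred = odeint(logistic, y_obs[0], t_obs, args=(r, K)).ravel()  # f(x(t), b)
    resid = y_obs - x_pred                  # deviations y(t) - f(x(t), b)
    # normal error distribution; constant terms dropped
    return len(y_obs) * np.log(sigma) + 0.5 * np.sum(resid ** 2) / sigma ** 2

fit = minimize(neg_log_lik, x0=[0.5, 20.0, 1.0], method="Nelder-Mead")
print(fit.x)  # best-fit r, K and error standard deviation
```

The same structure works with a Poisson error function by swapping the normal log-likelihood for the Poisson log-probability of the observed counts.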

In biology, the processes that give rise to deviations between model predictions and data are measurement error, process error, or both. Some simple definitions are:

  • Measurement error: y(t) = f(x(t), b) + e
  • Process (or demographic) error: y(t) = f(x(t)+e, b)

where x(t) is a variable, such as the population size at time t, b is a vector of parameters, f(x,b) is the solution to the deterministic model, e is the error as generated by a specified probability density function, and y(t) is the model prediction including the error. As examples, counting the number of blue ducks each year might be subject to measurement error if a major source of error is in correctly identifying the colour of the duck, whereas extreme weather events that affect duckling survivorship are a source of process error.
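As a toy illustration of the two formulations (my own hypothetical example, not from the definitions above), the sketch below simulates a simple discrete-time growth model x(t+1) = λx(t) under each type of error; the values of λ, the noise level and the number of years are made up.

```python
# Hypothetical illustration of measurement error versus process error
# for a simple discrete-time growth model; all numbers are made up.
import numpy as np

rng = np.random.default_rng(1)
lam, x0, n_years, sd = 1.2, 10.0, 20, 2.0

# Measurement error: the dynamics are exact and noise is added only to the observation.
x = x0 * lam ** np.arange(n_years)           # f(x(t), b), the deterministic solution
y_meas = x + rng.normal(0.0, sd, n_years)    # y(t) = f(x(t), b) + e

# Process error: noise enters the state itself and is propagated by the dynamics.
y_proc = np.empty(n_years)
y_proc[0] = x0
for t in range(1, n_years):
    y_proc[t] = lam * (y_proc[t - 1] + rng.normal(0.0, sd))  # y(t) = f(x(t) + e, b)
```

Plotting y_meas and y_proc against the deterministic solution illustrates the point made below: the measurement-error series scatters around f(x(t), b), while the process-error series drifts away from it because each year’s error is carried forward by the dynamics.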

In the simple approach described above, I intended to implement the measurement error formulation of the full model. Under this formulation, many of the probability density functions that might be chosen as the error distribution have a process-based interpretation. For example, the normal distribution arises if (1) there are many different sources of measurement error, (2) these errors arise from the same distribution, and (3) the total measurement error is the sum of all the errors. In biological data, all of that might be true to some degree, but in general this explanation is likely incomplete.
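A quick numerical check of that interpretation, under assumptions (1)–(3) with details I have added purely for illustration (fifty small, independent error sources drawn from the same uniform distribution):

```python
# Central limit theorem illustration: sums of many small, independent,
# identically distributed error terms are approximately normal.
# The number of error sources and their range are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
total_error = rng.uniform(-0.1, 0.1, size=(10_000, 50)).sum(axis=1)
print(total_error.mean(), total_error.std())  # roughly 0 and sqrt(50/300), about 0.41
```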

A second justification of the simple approach could be that the error distribution is not intended to be mechanistic; here, the normal distribution is simply a function that embodies the necessary characteristics – it is a decreasing function of the absolute value of the deviation. But if you have derived a mechanistic deterministic model, is it really okay to have an error distribution that isn’t justified on mechanistic grounds? Does such an error distribution undermine the mechanistic model formulation to the point where you might as well have started with a more heuristic formulation of the whole model? Would this be called semi-mechanistic – a mechanistic model with a heuristic error distribution?

If this all seems like no big deal, consider that measurement error does not compound under the effect of the deterministic model, while process error does. When only measurement error operates, the processes occur as hypothesized and only the measurements are off. When process error occurs – slightly higher duck mortality than average, say – there are fewer breeding ducks in the next year, and this change feeds back into the process, affecting the predictions made for future years. This makes fitting the model to y(t) quite difficult, because model fitting is easier when the model and the error can be separated so that numerical methods for solving deterministic models can be used. If the error and the model cannot be disentangled, then fitting to y(t) will usually involve solving a stochastic model of some sort, which is more difficult and more time consuming.

An easier alternative for the process error formulation is to fit using gradient matching. This works because deterministic models are usually differential equations, x'(t) = g(x(t), b). Let z(t) be a transformation of the data such that z(t) = [y(t+Δt) − y(t)]/Δt; then we can fit the model as z(t) = g(x(t), b) + e1, where e1 is the deviation between the empirical estimate of the gradient and the gradient as predicted by the model. Deviations from the model-predicted gradient can be viewed as errors in the model formulation, or as error that arises due to variation in the processes described by the model. If we have a mixture of measurement error and process error, then we could do something nice like generalized profiling.
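Here is a minimal sketch of gradient matching, reusing the hypothetical logistic model and made-up data from the earlier sketch; the finite-difference transformation and least-squares fit are one simple way to implement the idea, not the only one.

```python
# Gradient matching: difference the data to estimate the gradient, then fit
# the model's right-hand side g(x, b) to those empirical gradients.
import numpy as np
from scipy.optimize import minimize

t = np.arange(0.0, 10.0, 1.0)
y = np.array([2.1, 3.4, 5.2, 8.0, 11.5, 15.0, 17.8, 19.2, 19.9, 20.3])  # made-up counts

z = np.diff(y) / np.diff(t)          # z(t) = [y(t + dt) - y(t)] / dt, empirical gradient
y_mid = y[:-1]                       # state at which the model gradient is evaluated

def sse(params):
    r, K = params
    g = r * y_mid * (1 - y_mid / K)  # model-predicted gradient g(x(t), b)
    return np.sum((z - g) ** 2)      # least squares, i.e., normally distributed e1

fit = minimize(sse, x0=[0.5, 20.0], method="Nelder-Mead")
print(fit.x)  # estimates of r and K without ever solving the differential equation
```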

Anyway, this has all been my long-winded response to a couple of great posts about error at Theoretical Ecology by Florian Hartig. I wanted to revisit Florian’s question ‘what is error?’ Is error stochasticity? The latter would mean that e is a random variable, and I have a hard time imagining any good reason why e would not be a random variable. However, I think there are more issues to resolve if we want to understand how to define error. Specifically, how do we decide which processes are subsumed under f(x(t), b) and which go under e? Is this a judgment call, or should all the deterministic processes be part of f(x(t), b) and all the stochastic processes be put into e and therefore be considered error?

Who’s ready to Rock?

Several faculty at MUN have photos of icebergs on their websites. Luckily, I was able to get a jump start on this during my recent visit.

In Atlantic Canada, which university has:

  1. The largest undergraduate enrollment,
  2. The most research funding, and
  3. A non-zero abundance of icebergs?

If you said Memorial University (MUN), then you win. You’ll find it all in St. John’s, Newfoundland (a.k.a. The Rock).

If you, or someone you know, are interested in graduate studies at MUN, please visit my website.

Breakthrough mathematics, fundamental research and ideas

During my time on the train last week, I read some of the book ‘God created the integers: the mathematical breakthroughs that changed history’ by Stephen Hawking and several free hotel newspapers: the Globe and Mail, the Toronto Star and the National Post. This served as a supplement to my general musings on how to be more imaginative in my research and the innovation agenda.

The book title is based on a quote by the nineteenth-century mathematician Leopold Kronecker; in full, ‘God created the integers. All the rest is the work of Man.’ The quote speaks to the fact that modern mathematics is a magnificent outgrowth of the most humble beginnings: the whole numbers. The book starts by quoting Euclid as writing that “The Pythagoreans… having been brought up in the study of mathematics, thought that things are numbers… and the whole cosmos is a scale and a number.” In the first chapter, what caught my interest was the Pythagorean cult and its treatment of mathematical results such as the square root of 2 being irrational:

 The Pythagoreans carefully guarded this great discovery (irrational numbers) because it created a crisis that reached to the very roots of their cosmology. When the Pythagoreans learned that one of their members had divulged the secret to someone outside their circle they quickly made plans to throw the betrayer overboard and drown him while on the high seas. – p. 3

Next, I read the Intellectual Property supplement to the National Post, and in reading about intellectual property, I noted that priority in developing new technologies such as Google Glasses* is protected by patents, yet throwing people overboard to protect new advances in fundamental research is no longer appropriate. In fact, amongst scientists, insights and new results are freely shared. Arguably, as a consequence, advances in fundamental research have no market value – if they are freely given away to anyone, of any company, of any country (or so my reasoning goes).

Back to the book: the next chapters covered Archimedes, Diophantus, Rene Descartes, Isaac Newton and Leonhard Euler. Despite making advances in fundamental research, some of these mathematicians also worked on very applied projects: Archimedes on identifying counterfeit coins, and Euler on numerous projects including how to set up ship masts, correcting the level of the Finow canal, advising the government on pensions, annuities and insurance, supervising work on a plumbing system, and the Seven Bridges of Konigsberg problem. With regard to the Seven Bridges of Konigsberg problem,

Euler quickly realized he could solve the problem of the bridges simply by enumerating all possible walks that crossed bridges no more than once. However, that approach held no interest to him. Instead, he generalized the problem… – p. 388

On the shoulders of Giants – perhaps (perhaps necessary, but not sufficient). Irrespective of the boost: uncommonly brilliant and arguably unmatched. The photo is sourced from Andrew Dunn (http://www.andrewdunnphoto.com/).

… and perhaps that quote speaks to the tension in advancing applied research at the expense of fundamental research.

In reading the book, so far I’m most impressed by Newton**. How on earth did he think of that? By studying pendulums on earth, he arrives at a mechanistic model of planetary motion? Swinging pendulums and falling apples? Swinging and thudding? This doesn’t naturally evoke ideas of elliptical motion for me, let alone suggest that these events over such small distances are generalizable to a cosmic scale. Setting that aside, and continuing to generalize: every object I have ever pushed has… stopped. Yet for Newton’s first law, when it comes to objects in motion, earthly observations are the exception to the rule (not generalizable), and it takes an extra twist (external forces) to explain why, on earth, things always stop. Generalize for the universal theory of gravity; don’t generalize for the first law. I find it so not-obvious! And consequently, I’m so very impressed.

Footnotes

*Google is amazing. **And Newton, much more so.

How to not make bad models

Levin (1980)* is a concise and insightful discussion of where mathematical modelling can go wrong. It is quite relevant to my investigation of The Art of Mathematical Modelling and does a nice job of addressing my ‘why make models?’ question.

Vito Volterra is referred to as the father of mathematical biology in Levin (1980).

This paper answered one of the questions that I had long been wondering about: who is considered to be the father of mathematical biology? Levin’s answer is Vito Volterra** – at least for mathematical biologists who come from a mathematical background. Levin then says that modern-day mathematical biologists, as the descendants of Vito Volterra, lack his imagination, too often investigating special cases or making only small extensions of existing theory. It’s a fair point, but thinking takes time, and time is often in short supply. My take on Levin’s comment is ‘aspire to be imaginative, but remember to be productive too’. Furthermore, Levin identifies one of the ingredients that makes great models great: imagination – I’m adding that to my notes.

A second piece of advice is that mathematical models that make qualitative predictions are more valuable than those that make quantitative predictions. Levin’s reasoning is that ‘mathematical ecology as a predictive science is weaker than as an explanatory science because ecological principles are usually empirical generalizations that sit uneasily as axioms.’ That is quite eloquent – but is it really quite that simple? For example, if you make a quantitative prediction with a stated level of confidence (i.e., error bars), is that really so much worse than making a qualitative prediction? The sentiment of the quote appears to be not to overstate the exactness of the conclusions, but to me this seems equally applicable to quantitative and qualitative models.

Levin coins the phrase ‘mathematics dressed up as biology’. I have my own version of that: I like to say ‘that’s just math and a story’. Both phrases apply whenever there are weak links between the empirical observations and the model structure.

To conclude, this paper discusses why the different approaches of biologists and mathematicians to problem solving can result in mathematicians who are keen to analyze awkwardly derived models, and in biologists who lack an appreciation for the mathematician’s take on a cleanly formulated problem. Rather than discussing what makes great models great, Levin’s paper reads like advice on how not to make bad models, and because it is so hard to distill the essence of good models, looking at the art of mathematical modelling from that angle is a constructive line of inquiry.

References

Levin (1980). Mathematics, ecology and ornithology. Auk 74: 422–425.

Footnotes

*Suggested by lowendtheory, see Crowdsourcing from Oikos blog.

**Do you agree? For me, if this is true then the timing is interesting: Vito Volterra (1926), Ronald Ross (1908), Michaelis-Menten (1913), P.F. Verhulst (1838), J.B.S. Haldane (1924), and the Law of Mass Action dating to 1864.

Levin also hits on several items from my ‘why make models’ list and so I have updated that post.

Interpreting probabilities

Twice as a student, my professors off-handedly remarked that the parameterization of probabilistic models for real-world situations lacked a sound philosophical basis. The first time I heard it, I figured that if I ignored it, maybe it would go away. Or perhaps I had misheard. The second time it came up, I made a mental note to revisit this at a later date. Let’s do this now.

The question is how we should interpret a probability. For example, if I want to estimate the probability that a coin will land heads on a single toss, how should I construct the experiment? My professors had said that there was no non-circular, real-world interpretation of what a probability is. At the time, this bothered me, because I think of distributions like the binomial distribution as the simplest types of mathematical models: the mathematical models with the best predictive abilities and the most reasonable assumptions. Models in mathematical biology, on the other hand, are usually quite intricate, with assumptions that are a lot less tractable. My thinking was that if it was impossible to estimate the probability that a coin lands heads on solid philosophical grounds, then there was no hope for me, trying to estimate parameters for mathematical models in biology.

Upon further investigation, now I’m not so sure. Below I provide Elliott Sober’s discussion of some of the different interpretations of probabilities (pp. 61–70).

1. The relative frequency interpretation. A probability can be interpreted in terms of how often the event happens within a population of events, i.e., a coin that has a 0.5 probability of landing heads on a single toss will yield 50 heads on 100 tosses.

My view: This interpretation is not good because it’s not precise enough: a fair coin might very well not yield 50 heads on 100 tosses.

2. Subjective interpretation. A probability describes the ‘degree of belief that a certain character is true’, i.e., the probability describes the degree of belief we have that the coin will land heads before we toss it.

My view: conceptually, regarding how we interpret probabilities with respect to future events, this is a useful interpretation, but this is not a ‘real world’ interpretation and it doesn’t offer any insight into how to estimate probabilities.

3. Hypothetical relative frequency interpretation. The definition of the probability, p, is,

Pr(|f − p| > ε) → 0 as the number of trials, n, goes to infinity, for all ε > 0,

where f is the proportion of successes for n trials. Sober says this definition is circular because a probability is defined in terms of a probability converging to 0.

My view: This is a helpful conceptual interpretation of what a probability is, but again it’s unworkable as a real world definition because it requires an infinite number of trials.

4. Propensity interpretation. Characteristics of the object can be interpreted as translating into probabilities. For example, if the coin has equally balanced mass then it will land heads with probability 0.5. Sober says that this interpretation lacks generality and that ‘propensity’ is just a renaming of the concept of probability and so this isn’t a helpful advance.

My view: This is a helpful real world definition as long as we are able to produce a mechanistic description that can be recast in terms of the probability we are trying to estimate.

So far I don’t see too much wrong with 2-4 and I still think that I can estimate probabilities from data. Perhaps the issue is that Sober wants to understand what a probability is and I just want to estimate a probability from data; our goals are different.

I would go about my task of parameter estimation using maximum likelihood. The likelihood function tells me how likely it is that a parameter (which could be a probability) is equal to a particular value, given the data. The likelihood isn’t a probability, but I can generate confidence intervals for my parameter estimates given the data, and similarly, I could generate estimates of the probabilities of different estimates of the parameter. In terms of Sober’s question, understanding what a probability is, I now have a probability of a probability, and so maybe I’m no further ahead (this is the circularity mentioned in 3). However, for estimating my parameter this is not an issue: I have a parameter estimate (this is a probability) and a confidence interval (that was generated by a probability density).
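As a toy version of that workflow (my own hypothetical numbers, not Sober’s), the sketch below computes the maximum likelihood estimate of a coin’s heads probability from a finite number of tosses and attaches a likelihood-based confidence interval.

```python
# Maximum likelihood estimate of a coin's heads probability, with a 95%
# profile-likelihood confidence interval. The toss counts are made up.
import numpy as np
from scipy.stats import binom, chi2

n, heads = 100, 56                      # a finite number of trials
p_hat = heads / n                       # maximum likelihood estimate of p

def log_lik(p):
    return binom.logpmf(heads, n, p)

# keep values of p whose log-likelihood lies within the chi-squared cutoff of the maximum
cutoff = chi2.ppf(0.95, df=1) / 2.0
p_grid = np.linspace(0.001, 0.999, 999)
inside = p_grid[log_lik(p_grid) >= log_lik(p_hat) - cutoff]
print(p_hat, inside.min(), inside.max())  # point estimate and confidence interval
```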

Maybe… but I’m becoming less convinced that there really is a circularity in 3 in terms of understanding what a probability is. I think f(x)=f(x) is a circular definition, but f(f(x)) just requires applying the function twice. It’s a nested definition, not a circular definition. So which is this?

Word for word, this is Sober’s definition:

P(the coin lands heads | the coin is tossed) = 0.5 if, and only if, P(the frequency of heads = 0.5 ± ε | the coin is tossed n times) = 1 in the limit as n goes to infinity,

which he then says is circular because ‘the probability concept appears on both sides of the if-and-only-if’. It is the same probability concept, but strictly speaking, the probabilities on either side refer to different events. So while the definition might not work for understanding the concept of probability, it is helpful for estimating probabilities from relative frequencies, if only we can work around the issue of not being able to conduct an infinite number of trials. For me, that is how the likelihood framework helps: given a finite number of trials, for most situations of interest we won’t be able to estimate the parameter with 100% certainty, and so we need to apply our understanding of what a probability is a second time to reach an understanding of our parameter estimate.

But is that really a circular definition?

I’m not an expert on this, I just thought it was interesting. Is anyone familiar with these arguments?

References

Sober, E. (2000). Philosophy of Biology, 2nd ed. Westview Press, USA.