Why parsimony?

One question is whether a simple model necessarily exists for a given biological question; another is whether that model is unique. Taking this one step further: given two models that are equal in all regards except that one is more complex, why should we favour the simpler model? This argument, that we should prefer simpler explanations, is Occam’s razor.

[Image: William of Ockham]

Here’s the definition of Occam’s razor from Wikipedia:

It is a principle urging one to select, among competing hypotheses, that which makes the fewest assumptions and thereby offers the simplest explanation of the effect.

In fact, the Wikipedia page on Occam’s razor, for me, made for inspired reading. Here are some of the highlights*:

Justifications for Occam’s razor

  • Aesthetic: nature is simple and simple explanations are more likely to be true.
  • Empirical: You want the signal; you don’t want the noise. A complex model will give you both, e.g. overfitting in statistics.
  • Mathematical: hypotheses that have fewer adjustable parameters will automatically have an enhanced posterior probability because the predictions are sharper (Jeffreys & Berger, 1991)
  • Practical: it is easier to understand simple models.
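The mathematical justification can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the coin-flip setting, the uniform prior, and the function names are all made up for this example and are not taken from Jeffreys & Berger. It compares a model with no adjustable parameter (a fair coin) against a model with one adjustable parameter (an unknown bias with a uniform prior).

```python
from math import comb


def marginal_likelihood_fixed(n: int, k: int, p: float = 0.5) -> float:
    """P(k heads in n flips) under a model with no free parameter (p is fixed)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def marginal_likelihood_uniform(n: int, k: int) -> float:
    """P(k heads in n flips) when p is a free parameter with a uniform prior.

    Integrating the binomial likelihood over p in [0, 1] gives 1 / (n + 1)
    for every k: the flexible model spreads its bets over all outcomes.
    """
    return 1.0 / (n + 1)


def bayes_factor(n: int, k: int) -> float:
    """Evidence for the fixed-p model relative to the free-p model."""
    return marginal_likelihood_fixed(n, k) / marginal_likelihood_uniform(n, k)


if __name__ == "__main__":
    for k in (5, 9):
        print(f"{k} heads in 10 flips: Bayes factor = {bayes_factor(10, k):.3f}")
```

With 5 heads in 10 flips the Bayes factor is about 2.7 in favour of the parameter-free model, because its prediction is sharper. With 9 heads it drops to about 0.11, so the extra parameter earns its keep only when the data demand it.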

Alternatives to Occam’s razor

  • Popper (1992): For Popper it can all be cast in the light of falsifiability. We prefer simpler theories “because their empirical content is greater” and because they are testable.
  • Elliot Sober: simplicity considerations do not count unless they reflect something more fundamental.**

And yet my initial reaction to the definition of Occam’s razor was that it sounded a bit strange: simple explanations and few assumptions? Yikes, I can give you your simple explanation, but it’s going to take a lot of assumptions to get there. I think my confusion could be due to a difference in bookkeeping (and the phrasing ‘simple explanation of the effect’). In the Occam’s razor definition, you score only assumptions that contribute to the explanation. In biology, if the true explanation consists of n things-that-matter, the theoretician will say that the observation can be reproduced by considering only k < n of those things. Biologists, however, are used to scoring the number of assumptions as the number of things that are suspected to matter but that are neglected, i.e. n − k. This difference would seem to suggest that, although in biology we do value simplicity, we also value explanations that incorporate known contributing factors over explanations that ignore them. These values are reflected in Elliot Sober’s view, listed under Alternatives to Occam’s razor above.

However, even given that caveat, I think we still often prefer simple models in biology. Why? Here’s Ben Bolker (p7)*** with some insight:

By making more assumptions, (mechanistic models) allow you to extract more information from your data – with the risk of making the wrong assumptions.

That does kind of sum it up from the data analysis perspective: simple models make a lot of assumptions, but at the end of it you can conclude something concrete. Complex models still make assumptions, but they are a less restrictive type of assumption (i.e., an assumption about how a factor is included rather than an assumption to ignore it). All this flexibility in complex models means that many different parameter combinations can lead to the same outcome: inference is challenging, and parameters are likely to be unidentifiable. Given Wikipedia’s list of different justifications of Occam’s razor this seems to be an example of ‘using the mathematical justification to practical ends’. That is to say, this argument doesn’t seem to fit well into the list of justifications, but elements of the mathematical and the practical justifications are represented. Or perhaps it fits with Popper’s alternative view?
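To see what unidentifiable means in practice, here is a minimal sketch. The model is hypothetical: the names `growth` and `detection`, and the idea that an observed count is their product times `x`, are assumptions made up for this example. Because only the product of the two parameters enters the prediction, very different parameter combinations give exactly the same outcome.

```python
def predict(x: float, growth: float, detection: float) -> float:
    """Hypothetical model: the observed count is growth * detection * x."""
    return growth * detection * x


def predict_simple(x: float, slope: float) -> float:
    """Simpler model that lumps the two factors into one parameter."""
    return slope * x


xs = [1.0, 2.0, 3.0, 4.0]

# Two different parameter combinations with the same product (1.0).
fit_a = [predict(x, growth=2.0, detection=0.5) for x in xs]
fit_b = [predict(x, growth=4.0, detection=0.25) for x in xs]

# The one-parameter model reproduces both.
fit_simple = [predict_simple(x, slope=1.0) for x in xs]

if __name__ == "__main__":
    print(fit_a == fit_b == fit_simple)
```

No amount of data of this kind can separate `growth` from `detection`; the simpler model gives up the distinction but returns a single, well-defined estimate.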

For the theoretical ecologist, another reason that parsimony is often favoured is certainly the practical justification: because simple models are easier to understand.

What do you think? Is parsimony important in biology? And why?


Jeffreys and Berger (1991) Sharpening Ockham’s Razor on a Bayesian Strop. Alternatively, if that isn’t satisfying, this might do the trick:

Quine, W (1966) On simple theories in a complex world. In The Ways of Paradox and Other Essays. Harvard University Press.****


*Okay, so maybe the actual highlight for me was learning a new expression. The expression is ‘turtles all the way down’ and the best way to explain it is by using it in a sentence. Here goes: sometimes people say ‘yes, but that’s not really a mechanistic model because you could take this small part of it and make that more mechanistic, and then you could take parts of that and make those more mechanistic.’ And to that, I would say ‘yes, but why bother? It’s just going to be turtles all the way down’.

**fundamental = mechanistic, i.e. biological underpinning. This is a quote from Wikipedia and I need to chase down the exact reference for the statement. I have Elliot Sober (2000) Philosophy of Biology but he doesn’t seem to say anything quite this definitive.

***Ben suggests the references: Levins (1966) The strategy of model building in population biology; Orzack and Sober (1993) A critical assessment of Levins’s The strategy of model building in population biology; and Levins (1993) A response to Orzack and Sober: Formal analysis and the fluidity of science. [I’ll read them and let you know.]

****I haven’t read either, I just list the references in case anyone wants to follow up.

This entry was posted in Just simple enough by Amy Hurford.

About Amy Hurford

I am a theoretical biologist. I became aware of mathematical biology as an undergraduate when I conducted an internet search to learn about the topic. Now, twelve years later, I want to know, what is it that makes great models great? This blog is the chronology of my thoughts as I explore this topic.

10 thoughts on “Why parsimony?”

  1. Sober actually used to believe there were reasons to prefer parsimony (at least in the context of maximum parsimony methods for estimating phylogenetic trees), but changed his mind.

    • Yeah, thanks. What’s a good reference on his thoughts? Have you read his Philosophy of Biology book? I was surprised that despite the title it covers only evolution.

      • Hmm, tough to pick, Sober’s written a number of good books. The Nature of Selection is the first one that made a big splash, and set the agenda for much philosophy of biology. It’s actually largely because of that book that “philosophy of biology” has largely been synonymous with “philosophy of evolutionary biology”. Before that book, philosophers of science mostly didn’t pay much attention to any part of biology at all. I believe his recent book on the evidence for evolution might be the one to go to if you want a concise summary of his ideas about parsimony, and scientific evidence more broadly.

  2. Ben Bolker’s quote touches on one of the issues of parsimony, as does your mentioning of identifiability. If one has too many parameters requiring estimation (relative to the information content of the data), the precision of each parameter estimate will be low. As a result, predictions from such a model will be imprecise.

    Alternatively, if we set parameters to unrealistic values, then we will tend to make biased predictions. However, a key point here is that excluding a variable or component from a model can be thought of as setting a parameter to zero: we are making a specific assumption about the parameter rather than assuming we can estimate it with sufficient reliability (and hence set it to a possibly different value).

    Information theoretic methods attempt to balance this trade-off between including too many imprecise parameters and too few important parameters by finding the model that best represents the information content of the data, with a view to trying to identify the model that would best predict a replicate set of data.
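As a rough illustration of that trade-off, the sketch below compares a mean-only model (slope set to zero) with a straight-line model using AIC. It is only a sketch under assumptions: the two small data sets are invented for this example, Gaussian errors are assumed, and AIC is computed up to a constant shared by both models.

```python
from math import log


def rss_constant(ys):
    """Residual sum of squares for the mean-only model (slope fixed at zero)."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def rss_linear(xs, ys):
    """Residual sum of squares for a least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def aic(rss, n, n_params):
    """AIC for Gaussian errors, up to a constant shared by all models."""
    return n * log(rss / n) + 2 * n_params


def compare(xs, ys):
    """Return the AIC of each candidate model; lower is better."""
    n = len(xs)
    return {
        "constant": aic(rss_constant(ys), n, 2),  # mean + error variance
        "linear": aic(rss_linear(xs, ys), n, 3),  # intercept + slope + variance
    }


# Invented data: one set with a clear trend, one that is essentially flat.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys_trend = [0.1, 2.2, 3.9, 6.1, 7.8, 10.2, 11.9, 14.1]
ys_flat = [5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 4.9, 5.0]

if __name__ == "__main__":
    print("trend:", compare(xs, ys_trend))
    print("flat: ", compare(xs, ys_flat))
```

For the trending data the extra slope parameter pays for itself and the linear model has the lower AIC. For the flat data the slope barely reduces the residuals, so the penalty term tips the balance towards the simpler model.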

    So, to answer your question, parsimony is important in biology. But we need to understand models across a continuum from very simple to more complex, and how and why the models change with complexity.

    By the way, I really like this blog.


    Mick (mickresearch.wordpress.com)

    • Thanks Mick, I appreciate the comment because this ‘why parsimony’ post covered some key ground towards furthering my blog objectives. I definitely never thought the aesthetic argument was that compelling, but, actually, I find all the justifications for parsimony a bit unsatisfying and I am genuinely wondering what other people are thinking on this.

      I like your point about the continuum. I imagine that in some cases setting a parameter equal to zero screws up your inference and sometimes it doesn’t, but then again, I note that it’s not going to ever be possible to truly know how wrong your inference is. I could build a slightly more complex model and see if my assumptions in my simple model are biasing my results, but there’s no guarantee that there’s a monotonic (or any kind of) relationship between model complexity and the level of bias in my inference. And so then without trying as many different versions of the model as I can think of (short of finding the true model), I’m stuck, I think, with no guarantees that my inference is not totally wrong. I suppose this is a question of structural sensitivity – I’m not sure – but it does certainly seem like testing a range of options is better than thinking that just one model is going to do the trick. Thanks for your comment – much appreciated.

      And thanks for bringing all the Aussies over here. Hellooo Australia!

  3. I strongly believe in the value of parsimony in modeling, but the only grounds I have is practicality. It’s a lot easier to find the baloney if you didn’t mix it into a giant pan of chicken vindaloo.

  4. Just some thoughts on “I want to know, what is it that makes great models great?” A tough question to be sure, but one approach to defining a “great model” is to ask what you do with it. A model that simply explains available data but is put to no other purpose has minimal value and really isn’t a “great model”. A model that is used to direct further research is a good model. So a “great model” perhaps should be judged on the value of the follow-up experiments. If those experiments were useful then the model was “great”. Note that the model doesn’t have to be “correct”; indeed, correctness of a model is rarely determinable and hence cannot be used to judge a model. The key is that the model leads to a useful follow-on experiment.

    So, here is one “rule of thumb” for judging a “great model”:

    If you can convince a wet-lab biologist (or other scientist) to do an experiment based on the model then the model was a “good model”. If the results of the wet-lab experiment are useful then it was a “great model”. Note that the model could be completely wrong, but if it led to a lab experiment from which something particularly useful was learned then it was a “great model”.
