Getting the most information from your models: 6 keys to model selection

Today’s post is a guest blog by Shawn Leroux. Shawn is a postdoctoral fellow at the University of Ottawa and he’s going to write about model selection. Model selection techniques, and in particular Akaike's Information Criterion (AIC), consider the trade-off between fitting the data well and using too many parameters: exactly the types of considerations that go into choosing a model that is Just Simple Enough. Take it away, Shawn!


I just got back from attending a model selection workshop delivered by Dr. David Anderson (yes, from Burnham & Anderson 2002) and organized by the Quebec Center for Forest Research. I have been using the information-theoretic approach for several years but Dr. Anderson provided some insight that I thought would be useful to share. Below are the six key messages I am taking home from the workshop.

  1. Think, think, think. This seems obvious but it cannot be over-emphasized. Model selection is useless if the candidate set of models is not carefully chosen and represented with appropriate mathematical models. Candidate models should be justified and derived a priori.
  2. Adjusted R² should not be used for model selection. Burnham & Anderson (2002; p. 94-96) present convincing evidence for this. Consider a subset of their Table 2.1, which presents two (of nine) a priori models of avian species-accumulation curves from the Breeding Bird Survey (from Flather 1996). The table includes the model, number of parameters (K), ΔAIC, Akaike weights and adjusted R². If we simply consider the adjusted R², we conclude that both models are excellent fits to the data. However, model selection based on AIC shows that Model 1 is poor relative to Model 2. In fact, the evidence ratio (see pt 3 below) for Model 2 vs Model 1 is 3.0 × 10^35! There is little model selection uncertainty (see pt 4 below) in this two-model set. A quick look at the residuals or plots of observed vs predicted values for both models helps to understand why adjusted R² can be misleading for model selection.
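  3. Use evidence ratios. Evidence ratios are a concise way to quantify model selection uncertainty (see pt 4 below) and the weight of evidence for each model. A quick approximation for the evidence ratio of the Best Model (the model with the highest Akaike weight) vs Model 2 is given by Akaike weight of Best Model/Akaike weight of Model 2. More formally, the evidence ratio of model i vs model j is $E_{i,j} = \exp(-\Delta_i/2) / \exp(-\Delta_j/2)$, where $\Delta_i$ is the ΔAIC of model i. This is equal to the ratio of the two Akaike weights, $\omega_i / \omega_j$.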
  4. Akaike weights provide a measure of model selection uncertainty. An Akaike weight is the “probability that model i is the actual (fitted) Kullback-Leibler best model in the set” (Anderson 2008, p. xix). We have high model selection uncertainty if more than a couple of models in our set have some Akaike weight (e.g. Akaike weights for five models in a five-model candidate set are 0.4, 0.2, 0.175, 0.125, 0.1). We have low model selection uncertainty if all or most of the weight lies in one model (e.g. Akaike weights for five models in a five-model candidate set are 0.95, 0.0, 0.05, 0.0, 0.0). Model averaging (see pt 5 below) should be done if you have high model selection uncertainty.
  5. Use multimodel inference. Usually we make inference from the estimated best model in our candidate set of models. However, if we thought hard about which models to include in our a priori set and we have some model selection uncertainty (see pt 4 above), then many models may include precious information that can be used. If we use only the estimated best model, we may be throwing away useful information contained in other models in our set. Multimodel inference allows us to gain information from all models in our set. Model averaging predictions and model averaging parameters within models are two useful methods for multimodel inference. Model averaging for prediction is a weighted average of the predictions (Y) from each of the n models (Anderson 2008, p. 108): $\hat{\bar{Y}} = \sum_{i=1}^{n} \omega_i \hat{Y}_i$, where $\omega_i$ is the Akaike weight of model i and $\hat{Y}_i$ is the prediction from model i. Model averaging parameters within models is done similarly except we average parameter estimates instead of model predictions.
  6. Beware of pretending variables! Pretending variables occur when an unrelated variable enters the model set with a ΔAIC ~ 2, therefore causing us to consider this model to be a “good” model. We know we have a pretending variable when adding this variable does not change the deviance. If the deviance has not changed, we have not improved the fit of the model with the added parameter; because AIC is the deviance plus 2K, the one extra parameter simply adds about 2 to the AIC. Parameter estimates and confidence intervals around these estimates should be investigated to confirm the presence of a pretending variable. Pretending variable parameter estimates are usually ~ 0 with large confidence intervals. Pretending variables can skew Akaike weights by increasing model selection uncertainty and may bias multimodel inference, so they should be removed from a candidate model set.
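
The arithmetic behind points 3 to 5 is short enough to sketch in a few lines of Python. This is only an illustration and not code from the workshop: the function names, the AIC values and the predictions are all made up, and it assumes you already have an AIC value and a prediction from each candidate model.

```python
import math


def akaike_weights(aic_values):
    """Return (delta_aic, weights) for a list of AIC values."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    # Relative likelihood of each model given the data: exp(-delta / 2).
    rel_lik = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_lik)
    weights = [r / total for r in rel_lik]
    return deltas, weights


def evidence_ratio(weights, i, j):
    """Evidence ratio of model i vs model j (ratio of Akaike weights)."""
    return weights[i] / weights[j]


def model_averaged_prediction(weights, predictions):
    """Weighted average of the per-model predictions."""
    return sum(w * y for w, y in zip(weights, predictions))


# Hypothetical AIC values for three candidate models (illustration only).
aic = [100.0, 102.0, 110.0]
deltas, weights = akaike_weights(aic)

# Evidence for the best model (index 0) relative to the second model.
er = evidence_ratio(weights, 0, 1)

# Hypothetical predictions of the same quantity from each model.
preds = [10.0, 12.0, 20.0]
avg = model_averaged_prediction(weights, preds)

print("delta AIC:", deltas)
print("Akaike weights:", [round(w, 3) for w in weights])
print("evidence ratio, model 0 vs model 1:", round(er, 3))
print("model-averaged prediction:", round(avg, 3))
```

With a ΔAIC of 2 between the first two models, the evidence ratio works out to exp(1), so the best model is only about 2.7 times better supported than the runner-up. That is a case where model averaging is worth considering.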

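The pretending variable check in point 6 can also be sketched in Python. Again, this is a rough illustration under my own assumptions: the function names, the log-likelihoods, the confidence interval and especially the `deviance_tol` threshold are hypothetical, and the threshold is an arbitrary choice rather than a published rule. The sketch compares a base model with the same model plus one extra variable.

```python
def aic(log_lik, k):
    """AIC = deviance + 2K, where deviance = -2 * log-likelihood."""
    return -2.0 * log_lik + 2.0 * k


def check_pretending_variable(base, extended, ci, deviance_tol=0.5):
    """Flag a possible pretending variable.

    base, extended: (log_likelihood, K) for the model without and with
    the extra variable. ci: (low, high) confidence interval for the
    added parameter. deviance_tol is an arbitrary, hypothetical cut-off.
    """
    base_ll, base_k = base
    ext_ll, ext_k = extended
    # How much the deviance dropped when the extra variable was added.
    deviance_drop = (-2.0 * base_ll) - (-2.0 * ext_ll)
    # AIC of the extended model relative to the base model.
    delta_aic = aic(ext_ll, ext_k) - aic(base_ll, base_k)
    ci_low, ci_high = ci
    flag = (
        ext_k - base_k == 1              # exactly one added parameter
        and deviance_drop < deviance_tol  # fit essentially unchanged
        and ci_low <= 0.0 <= ci_high      # estimate consistent with zero
    )
    return {"deviance_drop": deviance_drop, "delta_aic": delta_aic, "flag": flag}


# Hypothetical case 1: the extra variable barely changes the deviance.
suspect = check_pretending_variable((-50.0, 3), (-49.9, 4), (-0.8, 0.9))

# Hypothetical case 2: the extra variable clearly improves the fit.
informative = check_pretending_variable((-50.0, 3), (-45.0, 4), (0.4, 1.6))

print(suspect)
print(informative)
```

In the first case the extended model sits within about 2 AIC units of the base model even though the fit has not improved, which is the pattern to watch for. A flag from a check like this is only a prompt to look at the parameter estimate and its confidence interval, not a verdict.
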
Thanks to Dr. Anderson for leading this workshop and to Marc Mazerolle at CEF for organizing it.


Anderson, D.R. 2008. Model based inference in the life sciences: A primer on evidence. Springer, New York.

Burnham, K.P. & Anderson, D.R. 2002. Model selection and multimodel inference: A practical information-theoretic approach, 2nd Ed. Springer, New York.

Flather, C.H. 1996. Fitting species-accumulation functions and assessing regional land use impacts on avian diversity. Journal of Biogeography 23: 155-168.