One of the weaknesses of the current Athlete Biological Passport (ABP) software seems to be an over-emphasis on outlier detection at the expense of level change detection. To illustrate this point, consider the Lance Armstrong case:

According to the UCI, Armstrong's blood data never triggered a flag in the software. The software screen was non-positive despite the obvious reticulocyte suppression that started after the Giro and persisted through the end of the Tour de France. As part of the expert evidence in the USADA decision, that suppression was later calculated to be statistically implausible.

The question, of course, is why the software failed.

The answer seems to lie in the interaction between level changes, outliers, and the variance (the spread in the data). Outliers are one-off values that are unusually far from the average. Level changes, on the other hand, are unusually long runs of consecutive values above or below the mean, where no single value is necessarily far from the average. Since outliers are one-off points, they have little effect on the overall variance. A level change, given enough points in the cluster, can increase the variance significantly.
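To put rough numbers on that, here is a minimal Python sketch. The series is a made-up, deterministic toy (values alternating between 0.9 and 1.1), not real blood data, and the sizes of the outlier and the shift are my own assumptions chosen for illustration.

```python
import statistics

# Toy baseline: 40 values alternating around a mean of 1.0 (hypothetical data).
baseline = [0.9 if i % 2 == 0 else 1.1 for i in range(40)]

# Case 1: a single outlier, one value pushed far below the rest.
with_outlier = list(baseline)
with_outlier[20] = 0.6

# Case 2: a level shift, the second half of the series lowered by 0.2.
with_shift = [v - 0.2 if i >= 20 else v for i, v in enumerate(baseline)]

def summarize(values):
    """Return the sample SD and the largest absolute z-score in the series."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    max_z = max(abs(v - mean) / sd for v in values)
    return sd, max_z

sd_base, z_base = summarize(baseline)
sd_outlier, z_outlier = summarize(with_outlier)
sd_shift, z_shift = summarize(with_shift)

for name, sd, z in [("baseline", sd_base, z_base),
                    ("one outlier", sd_outlier, z_outlier),
                    ("level shift", sd_shift, z_shift)]:
    print(f"{name:12s} sd={sd:.3f} largest |z|={z:.2f}")
```

In this toy case the level shift inflates the standard deviation more than the outlier does, yet no individual point in the shifted series sits more than about 1.4 standard deviations from the mean. The single outlier, by contrast, sits more than 3 standard deviations out.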

What does that mean for the biopassport?

Consider this mock-up example I made using some modified code from the MARSS package for the Nile river flow data. The data is useful as an example because it contains both outlier points and a level shift (there was a dam built just before 1900). After dividing the flow data by 1000, it looks a lot like a reticulocyte profile with an average just below 1. The code is useful because it uses a Kalman filter, which is essentially a special case of Bayesian updating, so we can get the basic gist of what the ABP software is doing. I further modified the MARSS example code by adding the 99.5% confidence intervals to the later figures, as would be used by the ABP software.

The figure above shows the fitting of a flat (top) and stochastic (bottom) model to the data. In both cases, the hidden state being modeled is the true reticulocyte count. In the flat model the true value is a flat line that falls on the average and any deviation away from this true state is model and observation error. In the stochastic model, the true value can move randomly in time and the observed value is this movement plus the model and observation error.
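For readers who want to see the mechanics, the two models can be sketched as a single local level Kalman filter in Python. This is not the MARSS code: MARSS estimates the variances by maximum likelihood, whereas here the process variance q, the observation variance r, the starting values, and the toy series are all hand-picked assumptions. Setting q = 0 gives the flat model and q > 0 gives the stochastic one.

```python
def local_level_filter(y, q, r, a1=1.0, p1=1.0):
    """Kalman filter for x[t+1] = x[t] + w, y[t] = x[t] + v.

    q is the variance of w (state movement), r the variance of v (observation
    error). a1 and p1 are the assumed prior mean and variance of the first state.
    Returns the filtered states and the one-step-ahead prediction errors.
    """
    a, p = a1, p1
    filtered, errors = [], []
    for obs in y:
        err = obs - a              # prediction error for this observation
        gain = p / (p + r)         # Kalman gain
        a = a + gain * err         # updated estimate of the hidden state
        p = p * (1.0 - gain) + q   # variance of the next prediction
        filtered.append(a)
        errors.append(err)
    return filtered, errors

# Hypothetical series: 20 values around 1.0, then 20 values around 0.75.
y = [(1.0 if i < 20 else 0.75) + 0.03 * (-1) ** i for i in range(40)]

flat_state, flat_err = local_level_filter(y, q=0.0, r=0.01)
stoch_state, stoch_err = local_level_filter(y, q=0.0005, r=0.01)

# Average prediction error over the last 10 points of each model.
flat_tail = sum(flat_err[-10:]) / 10
stoch_tail = sum(stoch_err[-10:]) / 10

print(f"flat:       final state {flat_state[-1]:.3f}, late mean error {flat_tail:.3f}")
print(f"stochastic: final state {stoch_state[-1]:.3f}, late mean error {stoch_tail:.3f}")
```

With these assumed settings the flat model ends up near the overall average and keeps under-shooting after the shift, while the stochastic model's hidden state follows the data down to the new level.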

The implication for interpreting the observed values is illustrated next, where the standardized residuals are plotted with the 95% (black) and 99.5% (red) confidence intervals below.

Comparing the two models, it can be seen that the level shift inflates the variance of the data when a flat model is used. Essentially, since the hidden state can't move, the error term has to be larger to absorb the shift. In the stochastic model, since the hidden state can move, the error is much smaller. In terms of picking up outliers, notice that the stochastic model actually does better than the flat model.
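As a small aside on the two bands, here is a Python sketch of how they translate into cutoffs. It assumes the standardized residuals are approximately standard normal, and the residual values in the example are invented purely to show the flagging step.

```python
from statistics import NormalDist

def two_sided_cutoff(coverage):
    """z value such that the central `coverage` fraction lies within +/- z."""
    return NormalDist().inv_cdf(0.5 + coverage / 2.0)

def flag(std_residuals, coverage):
    """Indices of standardized residuals falling outside the interval."""
    z = two_sided_cutoff(coverage)
    return [i for i, e in enumerate(std_residuals) if abs(e) > z]

z95 = two_sided_cutoff(0.95)    # about 1.96
z995 = two_sided_cutoff(0.995)  # about 2.81

# Made-up standardized residuals, for illustration only.
example = [0.3, -1.2, 2.1, -3.0, 0.8]
flags_95 = flag(example, 0.95)
flags_995 = flag(example, 0.995)

print(f"95% cutoff {z95:.2f} flags {flags_95}")
print(f"99.5% cutoff {z995:.2f} flags {flags_995}")
```

The wider 99.5% band flags fewer points, which is why inflating the variance (and so shrinking every standardized residual) makes it even harder for a value to be flagged.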

So far so good, but the elephant in the room (the level change) is still there. The solution, of course, is to take advantage of the fact that the hidden state moves in the stochastic model and run a test to detect a level change.

And there it is, in the bottom panel: the level change just before 1900. So in the case of this mock-up data, it appears that a level change test could potentially work in some scenarios where outlier detection would not. It would also potentially be more specific for modern doping, which uses EPO micro-doses and masking doses as well as more frequent small-volume withdrawals and transfusions.
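One way such a test can be built is sketched below in Python, using the standardized smoothed state disturbances of a local level model (the "auxiliary residuals" described in the state space literature). This is a hand-rolled sketch, not the MARSS implementation: the variances q and r, the starting values, and the toy series are all assumptions picked by hand.

```python
import math

def level_shift_statistics(y, q, r, a1=1.0, p1=1.0):
    """Standardized smoothed state disturbances for a local level model.

    stats[t] measures how surprising the move of the hidden state between
    observation t and t+1 is; large absolute values suggest a level change.
    """
    a, p = a1, p1
    errs, variances, carry = [], [], []
    for obs in y:                          # forward pass: Kalman filter
        err = obs - a
        var = p + r
        gain = p / var
        errs.append(err)
        variances.append(var)
        carry.append(1.0 - gain)
        a = a + gain * err
        p = p * (1.0 - gain) + q
    stats = [0.0] * len(y)
    back_r, back_n = 0.0, 0.0              # backward pass starts from zero
    for t in range(len(y) - 1, -1, -1):
        if back_n > 0.0:
            stats[t] = back_r / math.sqrt(back_n)
        back_r = errs[t] / variances[t] + carry[t] * back_r
        back_n = 1.0 / variances[t] + carry[t] ** 2 * back_n
    return stats

# Hypothetical series: level drops from 1.0 to 0.75 between index 19 and 20.
y = [(1.0 if i < 20 else 0.75) + 0.03 * (-1) ** i for i in range(40)]

stats = level_shift_statistics(y, q=0.0005, r=0.01)
peak = max(range(len(stats)), key=lambda i: abs(stats[i]))
print(f"largest statistic {stats[peak]:.2f} between index {peak} and {peak + 1}")
```

On this toy series the statistic peaks at the step where the level drops, even though the drop itself is modest. With real data the variances would need to be estimated rather than assumed, and the neighbouring steps also show elevated values, so the peak is best read as locating the shift rather than dating it exactly.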