It started with a splash, but the ripples continue to be felt.

When 16-year-old Chinese swimmer Ye Shiwen touched home to win a gold medal in the 400m individual medley, accusations began to fly.

She had beaten her personal best in the final by about five seconds and her world championship-winning time from a year earlier by seven seconds, leading one American swimming coach to label it “suspicious”. Suggestions of foul play were rife. Ye, who has never tested positive for performance-enhancing drugs - in or out of competition - called the accusations “sour grapes”. The International Olympic Committee declared her post-race drug test clean.

Her world record-setting victory has now been dissected more times than a medical school cadaver. Sports commentators, scientists, and swimming fans have produced charts, statistics and thousands of words both defending and questioning the authenticity of Ye’s performance.

The controversy hasn’t ended there, as I found out when I wrote a short explainer article for Nature News. While the arguments and counterarguments continue, the debate has shone a light on a little-studied area of sports science called “performance profiling”.

This fledgling field melds sports statistics and computer modelling to flag performances that on the surface seem to defy human physiology or an athlete’s career trajectory. Profiling has already highlighted past wins that were heavily suspected of involving some form of cheating. But judging performances as they happen in the pool, on the track or on the road is a different matter – how can authorities instantly distinguish a clean athlete who has put in a superlative performance from one who has cleverly evaded drug tests? When opinions and conjecture can make or break an athlete’s reputation, not to mention relationships between sporting nations, can science reveal when fast is too fast?

As Australian swimmer Ian Thorpe told the BBC, he and other elite swimmers like Michael Phelps took large chunks out of their personal best times in their youth. And at the same games 15-year-old Lithuanian swimmer Ruta Meilutyte improved her personal best by three seconds in the 100 metres breaststroke event – a comparable amount to Ye.

Ye’s final 100m – and her last 50m in particular – has become the most scrutinised leg of her race. Reporters and commentators were quick to point out that she swam her final 50m faster than American Ryan Lochte’s gold-medal-winning performance in the men’s event. It’s an incredible, though not unheard of, feat for a female. But Ye’s supporters have said that comparing like-for-like in terms of the last 50m times doesn’t tell the whole story.

Lochte had a huge lead going into the final lap and may have eased up as a result, whereas Ye was down nearly one second at the start of the freestyle leg. Her performance could have been the product of an extraordinary yet inexperienced athlete who didn’t know how to pace herself. However, Ross Tucker, an exercise physiologist at the University of Cape Town, South Africa, noted on his blog that athletes performing at their physiological limit tend to slow down as fatigue sets in, whereas Ye seemed to have a lot left in her tank after swimming 300m a couple of seconds off world record pace. Also, her final 100m was only 10% slower than the best times in the women’s 100m freestyle swim; typical 400m individual medley performances are 18-23% slower.
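The percentage comparison above is simple arithmetic, and a minimal sketch may make it concrete. The two times below are hypothetical values chosen only to illustrate a 10% gap; they are not Ye's actual splits. The function names are also invented for this example, and only the 18-23% range comes from the article.

```python
def percent_slower(split_seconds, benchmark_seconds):
    """Return how much slower a split is than a benchmark, in percent."""
    return (split_seconds - benchmark_seconds) / benchmark_seconds * 100.0


# Typical range quoted in the article for 400m individual medley swims.
TYPICAL_RANGE = (18.0, 23.0)


def is_unusually_fast(percent, typical_range=TYPICAL_RANGE):
    """True if the closing leg is closer to the benchmark than is typical."""
    return percent < typical_range[0]


# Hypothetical, illustrative times in seconds (NOT real race data).
benchmark = 53.0      # assumed best 100m freestyle time
closing_split = 58.3  # assumed final 100m of a medley

gap = percent_slower(closing_split, benchmark)
print(f"{gap:.1f}% slower; unusually fast: {is_unusually_fast(gap)}")
```

A figure below the typical range only marks a swim as unusual. It says nothing about the cause.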

Unusual trends

Using these types of statistics to establish whether a single performance is down to fair means or foul can raise questions but not answer them, say advocates for performance profiling. Critics may put an exceptional performance down to doping, but many other factors influence an individual run, swim or shot put, such as training, weather, sleep and diet.

What performance profiling can do is to help scientists and authorities determine the range of what is “acceptable”, and then use this information to screen and select athletes who might warrant closer scrutiny. Profiling could be particularly useful when athletes are training and most likely to dope, says Yorck Olaf Schumacher, an exercise physiologist at the Medical University of Freiburg, Germany. “It’s all about narrowing down a large collection of athletes to suspect ones.” It’s economically unfeasible to monitor every athlete 365 days a year, and performance profiling could help anti-doping authorities apportion limited resources.

One sport that has sought to analyse performances on such a scale is also one of its most tainted: road cycling. Since 2008, professional road cycling has taken a similar approach to performance profiling with the “biological passport”, which tracks physiological indicators associated with blood doping over an athlete’s career. A suspect passport led to more frequent tests for Spanish cyclist Antonio Colom, and in 2009 targeted anti-doping tests turned up the banned blood-cell boosting drug erythropoietin (EPO). He received a two-year ban.

Performances in cycling’s biggest races during this period reveal some unusual trends, according to a 2010 study in the Journal of Sports Sciences that analysed the average speeds of the top finishers in 11 races over 116 years. Performances tended to improve over time in a series of spurts before plateauing. These spurts occurred around World War I, at the dawn of pro cycling; between 1919 and 1939, when improved training and lighter bicycles joined the peloton; after World War II, when soldier-cyclists returned home; and after 1989, when cyclists began taking EPO, a banned artificial hormone that gives athletes greater stamina. Between 1989 and 1997, the average distance of the Tour de France increased from 3,285km to 3,944km (2,040-2,450 miles) and featured 17,000m (55,770 feet) of additional climbing – the equivalent of two Mount Everests. Average speeds should have slowed 11.3% during this period; instead, they rose 4.5%.
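To see how large that discrepancy is, the two percentages can be combined in a short calculation. This is a sketch using only the two figures quoted above. How the study derived its 11.3% expectation is not described here, so the comparison below is an illustration and not the study's own method.

```python
# Figures quoted in the article for the Tour de France, 1989 to 1997.
expected_change = -0.113  # speeds "should have slowed 11.3%"
observed_change = 0.045   # speeds actually rose 4.5%

# Simple difference, in percentage points.
point_gap = (observed_change - expected_change) * 100.0

# Relative gap: how much faster riders went than the expected speed.
relative_gap = ((1 + observed_change) / (1 + expected_change) - 1) * 100.0

print(f"Gap: {point_gap:.1f} percentage points")
print(f"Observed speed vs expected speed: {relative_gap:.1f}% faster")
```

Measured against the expected speed, the gap comes out larger than the simple difference between the two percentages.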

Historical performance profiling suggests that athletes in other endurance sports may have been culpable too. In a 2009 paper titled Performance Profiling: A Role for Sport Science in the Fight Against Doping?, Schumacher and a colleague uncovered similar trajectories in distance running. Men’s 5,000m and 10,000m times fell dramatically in the 1990s and began rising again in the 2000s, only after a test for EPO was developed by anti-doping scientists.

EPO and other blood boosters aren’t the only performance enhancers whose introduction created blips in sports statistics. Schumacher’s study found that women’s discus throws became dramatically longer in the 1960s, 70s and 80s during a boom in steroid use; they came down to earth in the late 1980s when authorities introduced out-of-competition testing.

Data crunch

Performance profiling is already happening informally, says Tucker, because athletes who turn out gold-medal-winning performances are tested more frequently in and out of competition than also-rans. But proponents of profiling argue that statistical modelling, not just success, should be used to identify the athletes to watch most closely.

In a 2010 paper published in the open-access journal PLoS ONE, Geoffroy Berthelot, a computer science researcher at Irmes Insep in Paris, and his team studied thousands of top track and field and swimming performances recorded between 1891 and 2008, and came up with three statistical measures for how atypical an individual race or swim was, leaving any doping accusations aside. Their metrics looked at how much of an outlier a performance was from other athletes, as well as how long it took another athlete to eclipse a time. Florence Griffith Joyner’s 1988 world record of 10.49sec in the women’s 100m sprint scored high on the latter rating. Joyner, who died in her sleep from a sudden epileptic seizure in 1998, never tested positive for performance enhancers but her career was stained with the suggestion she used steroids.
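Two of the ideas described above, distance from other athletes and time until a mark is beaten, can be sketched in a few lines. The formulas below are simple stand-ins (a standard-deviation score and a year count), not the measures defined in the paper, and all times and years are made-up illustrative values.

```python
from statistics import mean, stdev


def outlier_score(time, peer_times):
    """Standard deviations by which a time beats the peer average.

    Positive means faster than peers. Assumes lower times are better.
    """
    return (mean(peer_times) - time) / stdev(peer_times)


def years_until_eclipsed(record_year, record_time, later_results):
    """Years until a later result beats the record, or None if none does.

    later_results is a list of (year, time) pairs.
    """
    for year, result in sorted(later_results):
        if year > record_year and result < record_time:
            return year - record_year
    return None


# Hypothetical data (NOT real results).
peers = [10.9, 11.0, 11.1]
later = [(2003, 10.70), (2007, 10.55)]

print(round(outlier_score(10.6, peers), 2))
print(years_until_eclipsed(2000, 10.6, later))
```

A record that is never beaten returns None here; a real metric would need a rule for marks that are still standing.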

Picking out peculiar performances with 10, 20, even 30 years of hindsight and data is easy compared to spotting them in real time, says Berthelot, who is working on developing metrics that could be used without decades of data. Atypical performances, he says, stand out when compared to an athlete’s prior performances. Joyner, for instance, ran her fastest times in her late 20s and early 30s, an unusual trajectory for a sprinter, says Berthelot. Though Usain Bolt has also lowered the 100m world record time considerably in a short space of time, his race times have improved smoothly and, at 25, he sits at the typical peak of a sprinter’s career.

To make performance profiling more rigorous, sports scientists must determine a typical trajectory for each sport and each event by studying data from hundreds, even thousands, of careers, says Berthelot. Sprinters tend to peak young and fall off quickly, while distance runners typically peak in their late 20s and early 30s and lose their speed less quickly. With this foundation and as much data as possible on an athlete’s past performances, computer models could estimate the probability that a performance was unusual enough to warrant closer drug screening, or whether it fits the trajectory of an exceptional career.
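One very simple way to turn reference careers into a probability is to ask how often comparable athletes improved by at least as much. The sketch below assumes a reference set of year-on-year improvements (in percent) for athletes of the same age and event. The data, the function name and the 5% screening threshold are all hypothetical, and real models of the kind Berthelot describes would be far more elaborate.

```python
def tail_probability(improvement, reference_improvements):
    """Estimate the share of reference athletes improving at least this much.

    Uses add-one smoothing so the estimate is never exactly zero,
    which matters when the reference set is small.
    """
    at_least = sum(1 for r in reference_improvements if r >= improvement)
    return (at_least + 1) / (len(reference_improvements) + 1)


# Hypothetical year-on-year improvements in percent (NOT real data).
reference = [0.5, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5, 3.0]

SCREENING_THRESHOLD = 0.05  # assumed cut-off for extra testing

p = tail_probability(2.9, reference)
print(f"p = {p:.2f}; flag for screening: {p < SCREENING_THRESHOLD}")
```

A low probability here would only suggest closer screening. With a small reference set, as for very young athletes, the estimate stays coarse.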

Undercover tactics

Such models would make predictions with less certainty for young athletes such as Ye, who do not have a long career’s worth of data points, but junior records could fill the gap, says Berthelot. Another challenge is to choose carefully the athletes and eras against which such models are calibrated. Career trajectories of doped athletes won’t do a good job of discerning pharmacologically enhanced performances – garbage in, garbage out, as they say. And sports technology adds another problem to benchmarking performance profiling. Just as lighter bikes improved cycling times in the 1930s, full-body suits have been behind many of the performance gains in swimming in the 2000s, says Berthelot.

Proponents of performance profiling stress that such measures ought to be used to screen athletes, not discipline them. “That would be unfair,” says Tucker. “The final verdict is only ever going to be reached by testing. It has to be.”

Some endurance sports are already exploring this possibility. In December 2008 the governing body of biathlon, a gruelling winter sport that combines cross-country skiing and target shooting, sentenced world champion Dmitry Yaroshenko to a two-year ban for taking EPO. Yaroshenko was flagged by a software programme that tracks blood physiology and performance, and out-of-competition testing turned up proof of EPO. Tucker also knows of one pro-cycling team that voluntarily reports power measurements to cycling authorities to show that its athletes are drug-free.

Critics may counter that institutionalised performance profiling will cast a pall over great races such as Ye’s, by looking for evidence of cheating. That needn’t be the case. Statistical profiling would be automatic and undercover, just as biological passport profiling already is. Tucker and other sports scientists expect that Ye and other Olympic gold medallists will draw closer scrutiny from doping authorities. But, in a connected age of super slo-mo replays, telemetry measurements and databases, shouldn’t such attention be based on statistics instead of hunches?
