Beautiful and mathematical: Football as a numbers game
"Big data" - the world of analytics, algorithms and statistical models - is increasingly part of our lives, and professional sports such as football are no different.
"Why did you pick him?" "Don't take it short, get it in the box!" "Put a striker on!"
Complaining about the manager's selections and questioning players' decisions on the pitch are time-honoured traditions of being a football fan.
Whether watching from the stands or on TV, or listening to the match on the radio, offering full-throated advice to the coach and exhorting the players to try harder or do something different is one of the "joys" of supporting a team.
We football fans flatter ourselves that our alternative ideas would immediately improve the team's performance. Mostly they're based on intuition and a "feel for the game", often nurtured over years of watching nil-nil draws in the freezing rain at uncovered away ends.
Nowadays, when player acquisitions or formations strike us as baffling or obscure, there is likely to be method in the madness. As data on player attributes, movements and positioning become more comprehensive and analytical models more sophisticated, football is relying much less on gut instincts.
It is still "the beautiful game", but it is one that increasingly resembles a game of chess.
Data is being created at unprecedented rates, in all conceivable areas of life. We are living at the start of the age of so-called "big data". Analytics, algorithms and statistical models are increasingly part of our lives, whether we like it or not.
Professional sports are no different. This is an extraordinarily lucrative sector, where data has been identified as potentially giving athletes and teams a competitive edge.
The data revolution in sports is often traced to Billy Beane, general manager of the Oakland Athletics or A's, an unheralded team in Major League Baseball in the US. Beane employed a method that came to be known as Moneyball after Michael Lewis published a book about the A's in 2003.
Beane used an analytical, evidence-based approach to identifying players who could meaningfully contribute to the team and offer good value for money. It drew on sabermetrics, a scientific method for analysing baseball performance pioneered by Bill James. The sustained success of the A's on a limited budget, later chronicled in a movie based on Moneyball starring Brad Pitt, turned the spotlight on data analysis in sports.
From baseball, these analytical methods for appraising players quickly spread to the NFL and NBA, and a number of sports in the UK. In cricket, former England coach Duncan Fletcher favoured statistical analysis of batting and bowling to identify the best way for players to score runs and to get batsmen out.
Clive Woodward's innovations in using player data helped the England rugby team to win the World Cup. Dave Brailsford's innovations in performance training data helped make Team Sky multiple Tour de France winners.
Boots on the ground
In football there were pioneers too. In fact, recording granular data on players and match events goes back further than you might think.
Charles Reep coded his first football match, counting passes and noting positions, in 1950. Valeriy Lobanovsky was doing the same in Ukraine in the 1970s. Former England manager Graham Taylor also used a crude form of analytics to inform his long-ball tactics with Watford in the 1980s.
With the launch of the Premier League in 1992, and the money and exposure brought by the Sky TV deal, a number of football data companies were launched, including Prozone in 1995 and Opta in 1996.
These early efforts were impressive for the time. For instance, the computer game Championship Manager (later renamed Football Manager) launched in 1992 with a database of 4,000 players and statistics on 30 attributes per player.
Speaking to the British Science Festival in Swansea last week, Dr Tom Markham, head of strategic business development at Sports Interactive - makers of Football Manager - said those numbers have exploded in the subsequent decades.
"The game now has a database with 319,726 current players. With former players, who may take other roles in football, it comes in at over 600,000."
Compiling that database, Dr Markham said, is a big job.
"We have people on the ground in 51 different countries covering 140 leagues. There are 2,250 fully researched clubs, with 250 statistics on each player - aggregated to 47 in the user interface.
"With 1,300 scouts, all the main clubs have one researcher, and top clubs like Chelsea have multiple experts."
Some Football Manager alumni have gone on to work as scouts with professional teams, he added.
Game of probability
As professional football revenues continue to grow, and leagues become increasingly competitive, the data industry has also expanded. Huge amounts of data from companies like Opta and Prozone underpin not only team tactics but also sophisticated media coverage.
Coaches employ wearable tech to monitor player fatigue on the pitch and in training, to prevent injuries resulting from physically overloading players. Recorded movements on the pitch inform models of formations and playing style, with simulations and in-game stats for coaches to make halftime adjustments.
Data analysis is about spotting patterns and making predictions. Recording the direction of a player's penalty shots can show which area he favours. Knowing this, a goalkeeper can increase the probability of "guessing" right.
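The penalty example can be sketched in a few lines of Python. This is a minimal illustration only: the shot record is made up, and the function names `direction_frequencies` and `best_guess` are hypothetical, not taken from any real scouting tool.

```python
from collections import Counter

# Hypothetical record of where one taker placed past penalties
# (illustrative data, not taken from any real player).
past_penalties = ["left", "left", "right", "left", "centre", "left", "right", "left"]


def direction_frequencies(shots):
    """Return the share of shots aimed at each area of the goal."""
    counts = Counter(shots)
    total = len(shots)
    return {area: n / total for area, n in counts.items()}


def best_guess(shots):
    """Return the area the taker has favoured most often."""
    return Counter(shots).most_common(1)[0][0]


freqs = direction_frequencies(past_penalties)
print(freqs)
print(best_guess(past_penalties))
```

With this made-up record the taker goes left five times in eight, so a goalkeeper diving left would be right more often than one choosing at random. A real analysis would need far more shots and would have to allow for takers who vary their habits.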
One important metric is "expected goals", a key input in betting and analytical models. It is a predicted probability of a goal coming from a shot in a particular area of the pitch. How many shots a team has from those areas can be used to predict the likelihood of scoring.
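As a rough sketch of how expected goals works, the Python below gives each area of the pitch a scoring probability and adds them up over a team's shots. The zone names and probabilities are invented for illustration, and the assumption that shots are independent of one another is a simplification; real models are fitted to large volumes of match data.

```python
import math

# Hypothetical chance of scoring from a shot in each zone
# (made-up values for illustration; real models are fitted to match data).
XG_BY_ZONE = {"six_yard_box": 0.40, "penalty_area": 0.15, "outside_box": 0.03}


def expected_goals(shot_counts):
    """Sum each shot's scoring probability to get the team's expected goals."""
    return sum(XG_BY_ZONE[zone] * n for zone, n in shot_counts.items())


def prob_at_least_one_goal(shot_counts):
    """Chance of scoring at least once, treating shots as independent."""
    p_no_goal = math.prod(
        (1 - XG_BY_ZONE[zone]) ** n for zone, n in shot_counts.items()
    )
    return 1 - p_no_goal


# Hypothetical shot counts for one team in one match.
shots = {"six_yard_box": 1, "penalty_area": 4, "outside_box": 6}
print(round(expected_goals(shots), 2))
print(round(prob_at_least_one_goal(shots), 2))
```

In this invented match the 11 shots add up to an expected 1.18 goals, most of it from the five shots inside the penalty area rather than the six from distance.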
When Leicester became Premier League champions, it was a huge shock. But it is no coincidence that their use of analytics was among the most comprehensive and forward-looking in the league.
Leicester's unusual style of play, with little possession and relying on fast attacks, took many opponents by surprise. The team suffered virtually no injuries, and relied on the emergence of unheralded players like N'Golo Kante and Jamie Vardy.
Those who believe in the data-driven approach would say this is exactly the kind of comparative advantage statistics can bring.
Another of the great rituals for football fans is speculating about transfers. Who are we going to buy? Who should we buy?
Buying and selling players is a huge business. In the recently concluded summer transfer window, Premier League teams combined to spend over £1bn, with Manchester United spending in excess of £80m on a single player.
Datasets like those compiled by Football Manager have become a resource for the scouting and recruitment operations of many teams.
Finding a low-cost, high-impact player like Riyad Mahrez or Dimitri Payet can have remarkable results on the pitch. For clubs with smaller budgets, finding a rough gem or talented youngster that they can later sell for a profit is a crucial form of revenue.
"£8 million a year is the average running cost for a tier 1 academy and teams have to find talented youngsters who they can nurture and sell on," Dr Markham said.
But assessing young talent is difficult, and not every talented youngster will become a Gareth Bale, who was discovered as a boy in Wales, nurtured by Southampton's youth academy and later signed for Real Madrid for a world record fee.
Markham told another story of young talent, Martin Odegaard, the Norwegian prodigy who signed for Real Madrid at the age of 16 after making his debut for the national team at just 15.
When Football Manager came out in Norway, Odegaard wasn't in the game because he was a minor, causing a metaphorical riot among Norwegian fans. He was added to the game's database when his dad tweeted a picture giving parental consent.
But how to rate the prodigy?
When the Football Manager club scout sent his rankings to the head of Norway operations, they raised a red flag. How could a 15-year-old score so highly? The Norwegian chief went to see Odegaard play a dozen times before corroborating the data sent through to the London HQ, where it was again rejected as improbable.
Dr Markham says Odegaard's stats went through a dozen different checks before his astonishing grades were accepted.
Recruitment is so important to professional clubs that the average Premier League team has seven international scouts. But clubs don't have the resources to cover players in every country - and many teams use Football Manager to inform their own scouting strategies, Dr Markham said.
Other teams are creating their own datasets, and working with other companies to come up with bespoke solutions. Teams using analytics to thrive include Brentford and the Danish club Midtjylland, both with connections to Matthew Benham, a noted convert to the data-driven analytical approach.
Aside from clubs and gamers using simulations, data underpins many other aspects of the football industry, from TV coverage to betting models and fantasy football. Analytics that spot patterns in match results are also used to monitor match fixing. Books with titles like Soccernomics and Soccermatics allow fans to get close to the "action" of data analytics.
The relationship of gaming with professional football goes both ways. Players enjoy simulations like FIFA in their frequent downtime and many players are used to receiving data on their own performance.
A picture of Paul Pogba playing Football Manager and signing himself for Chelsea set off speculation that he might move from Juventus to Chelsea.
And according to Dr Markham, those involved in the beautiful game itself can be - perhaps unsurprisingly - fixated on their representation in virtual versions like Football Manager.
He often receives messages from players and agents, he said. "Sometimes they complain about their ratings in the game, or their agents try to get them put up."