The man who got rich on data - years before Google

By Tim Harford
Presenter, 50 Things That Made the Modern Economy

Amazon, Alphabet, Alibaba, Facebook, Tencent - five of the world's 10 most valuable companies, all less than 25 years old - and all got rich, in their own ways, on data.

No wonder it's become common to call data the "new oil". As recently as 2011, five of the top 10 were oil companies. Now, only ExxonMobil clings on.

The analogy isn't perfect. Data can be used many times, oil only once.

But data is like oil in that the crude, unrefined stuff is not much use to anyone.

You have to process it to get something valuable. You refine oil into diesel before it can run an engine.

With data, you need to analyse it to provide insights that can inform decisions - which advert to insert in a social media timeline, which search result to put at the top of the page.

Imagine you were asked to make just one of those decisions.

Someone is watching a video on YouTube, which is run by Google, which is owned by Alphabet. What should the system suggest they watch next? Pique their interest, and YouTube gets to serve them another advert. Lose their attention, and they will click away.

You have all the data you need. Consider every other YouTube video they have ever watched - what are they interested in? Now, look at what other users have gone on to watch after this video.

Weigh up the options, calculate probabilities. If you choose wisely, and they view another ad, well done - you've earned Alphabet all of, ooh, maybe 20 cents (15p).
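To make "weigh up the options, calculate probabilities" concrete, here is a toy sketch in Python. It is not how YouTube works; it only counts which video other viewers watched straight after the current one and suggests the most frequent follower. The function names and the sample viewing histories are invented for illustration.

```python
from collections import Counter

def next_video_probabilities(watch_histories, current_video):
    """Estimate how likely each video is to be watched right after current_video."""
    followers = Counter()
    for history in watch_histories:
        # Pair each video with the one watched immediately after it.
        for here, after in zip(history, history[1:]):
            if here == current_video:
                followers[after] += 1
    total = sum(followers.values())
    if total == 0:
        return {}
    return {video: count / total for video, count in followers.items()}

def suggest_next(watch_histories, current_video, already_seen=()):
    """Pick the most probable follower the viewer has not already seen."""
    probs = next_video_probabilities(watch_histories, current_video)
    candidates = {v: p for v, p in probs.items() if v not in already_seen}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

# Hypothetical viewing histories, one list per user, in watch order.
histories = [
    ["census", "looms", "ibm"],
    ["census", "ibm"],
    ["census", "looms"],
    ["looms", "census", "looms"],
]

print(next_video_probabilities(histories, "census"))  # {'looms': 0.75, 'ibm': 0.25}
print(suggest_next(histories, "census"))              # looms
```

Even this crude version has to scan every history for every suggestion, which hints at why the job cannot be left to people.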

Clearly, relying on humans to process data would be impossibly inefficient. These business models need machines.

In the data economy, power comes not from data alone but from the interplay of data and algorithm.

50 Things That Made the Modern Economy highlights the inventions, ideas and innovations that helped create the economic world.

In the 1880s, a young German-American inventor tried to interest his family in a machine to process data more quickly than humans could manage.

Herman Hollerith had designed the machine but needed money to test it.

Picture something that looks a bit like an upright piano but instead of keys, it has a slot for cards, about the size of a dollar bill, with holes punched in them.

Image: Herman Hollerith's tabulator and sorter box, used to process the 1890 United States census (Getty Images)

Facing you are 40 dials, which may or may not tick upwards after you insert each new card.

Herman Hollerith's family didn't get it. Far from rushing to invest, they laughed at him. Hollerith evidently did not forgive - he cut them off. His children were to grow up with no idea they had relatives on their father's side.

Hollerith's invention responded to a very specific problem. Every 10 years, the US government conducted a census. That was nothing new. Governments through the ages have wanted to know who lives where and who owns what, to help raise taxes and find conscripts.

But if you're going to send a small army of enumerators around the country, it must be tempting to ask about an ever wider range of things. What jobs do people do? Any illnesses or disabilities? What languages do they speak?

Knowledge is power, as 19th Century bureaucrats understood just as well as 21st Century platform companies.

Yet with the 1880 census, the bureaucrats had swallowed more data than they could digest.

The census had been expanding to include libraries, nursing homes, crime statistics and many other topics. In 1870, the census had five different kinds of questionnaire. In 1880, it had 215.

It soon became clear that adding up the answers would take years - they'd barely have finished this census when it would be time to start the next one.

A lucrative government contract surely awaited anyone who could speed the process up.

Young Herman had worked on the 1880 census, so he understood the problem.

He had decided to seek his fortune by inventing a new kind of brake for trains. As it happened, a train journey helped him to solve the census problem instead.

Rail tickets were often stolen. So railway companies found an ingenious way to link them to the person who'd bought them: a "punch photograph".

Conductors used a hole-punch to select from a range of physical descriptors - as Hollerith recalled: "Light hair, dark eyes, large nose et cetera." If a dark-haired, small-nosed scoundrel stole your ticket, he wouldn't get far.

And after observing this system, Hollerith realised people's answers to census questions could also be represented as holes in cards.

That could solve the problem, because punched cards had been used to control machines since the early 1800s - the Jacquard loom wove patterned fabric based on them.

Image: Weaver and inventor Joseph Marie Jacquard demonstrating his loom complete with punched cards containing the pattern instructions (Getty Images)

All Hollerith needed to do was make a "tabulating machine" to add up the census punch cards he envisaged.

In that piano-like contraption, a set of spring-loaded pins descended on the card; where they found a hole, they completed an electrical circuit, which moved the appropriate dial up by one.
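The counting step can be imitated in a few lines of Python. This is a loose sketch, not a model of the real machine: each card is represented as a set of punched positions, and the `wiring` table that connects a hole to a dial is a hypothetical layout chosen for the example. Only the figure of 40 dials comes from the description above.

```python
NUM_DIALS = 40  # the article describes 40 dials facing the operator

def tabulate(cards, wiring, num_dials=NUM_DIALS):
    """Count punched holes across a stack of cards.

    cards  -- iterable of sets, each holding the hole positions punched in one card
    wiring -- dict mapping a hole position to the index of the dial it drives
              (hypothetical layout, chosen for this example)
    """
    dials = [0] * num_dials
    for card in cards:
        for hole, dial in wiring.items():
            if hole in card:       # the pin passes through and closes the circuit
                dials[dial] += 1   # the connected dial ticks up by one
    return dials

# Hypothetical wiring and a stack of three cards.
wiring = {"male": 0, "female": 1, "married": 2}
cards = [{"male", "married"}, {"female"}, {"female", "married"}]

dials = tabulate(cards, wiring)
print(dials[:3])  # [1, 2, 2]
```

Dials with no hole wired to them never move, which matches the remark that a dial "may or may not tick upwards" for any given card.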

Happily for Hollerith, the bureaucrats were more impressed than his family. They rented his machines to count the 1890 census, to which they'd added yet more questionnaires.

Compared with the old system, Hollerith's machines proved years quicker and millions of dollars cheaper.

More importantly, they made it easier to interrogate the data. Suppose you wanted to find people aged 40 to 45, married, and working as carpenters. No need to sift through 200 tonnes of paperwork - just set up the machine and run the cards through it.
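In modern terms, that is a filter over records. The Python sketch below shows the idea under simplifying assumptions: the `Card` fields and the sample cards are invented, and a real census card encoded answers as hole positions rather than named fields.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Card:
    """One person's census answers (simplified, hypothetical fields)."""
    age: int
    married: bool
    occupation: str

def select(cards, min_age, max_age, married, occupation):
    """Return the cards that satisfy every condition at once."""
    return [
        card for card in cards
        if min_age <= card.age <= max_age
        and card.married == married
        and card.occupation == occupation
    ]

# A small, made-up stack of cards.
stack = [
    Card(42, True, "carpenter"),
    Card(39, True, "carpenter"),   # too young
    Card(44, False, "carpenter"),  # not married
    Card(45, True, "carpenter"),
    Card(41, True, "farmer"),      # wrong occupation
]

matches = select(stack, min_age=40, max_age=45, married=True, occupation="carpenter")
print(len(matches))  # 2
```

Changing the question means changing the arguments, not re-reading the paperwork, which is the advantage the passage describes.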

Governments soon saw uses far beyond the census.

"Across the world," says historian Adam Tooze, "bureaucrats were inspired to dream of omniscience." America's first social security benefits were disbursed through punched cards in the 1930s.

Businesses, too, were quick to see the potential. Insurers used punched cards for actuarial calculations, utilities for billing, railways for shipping, manufacturers to keep track of sales and costs.

Hollerith's Tabulating Machine Company did a roaring trade. You may have heard of the company that, through mergers, it eventually became: IBM.

It remained a market leader as punched cards gave way to magnetic storage, and tabulating machines to programmable computers. It was still on the list of the world's 10 biggest companies a few years ago.

But if the power of data was apparent to Hollerith's customers, why did the data economy take another century to arrive?

Image: Voice-activated smart speakers capture ever greater amounts of data about us

Because there's something new about the kind of data that's now being compared to oil - the likes of Google and Amazon don't need an army of enumerators to collect it.

We trail it behind us every time we use our smartphones or ask Alexa to turn the light on.

This kind of data is not as neatly structured as the pre-defined answers to census questions precision-punched into Hollerith's cards. That makes it harder to make sense of. But there's unimaginably more of it.

And as algorithms improve, and more of our lives are lived online, that bureaucratic dream of omniscience is fast becoming corporate reality.

The author writes the Financial Times's Undercover Economist column. 50 Things That Made the Modern Economy is broadcast on the BBC World Service. You can find more information about the programme's sources and listen to all the episodes online or subscribe to the programme podcast.