
Reality Check: Was Facebook data's value 'literally nothing'?


There is a huge spectrum of opinion on the value of the Facebook data that Cambridge University academic Aleksandr Kogan gave to Cambridge Analytica's parent company, SCL.

Dr Kogan told a parliamentary committee: "Given what we know now, nothing, literally nothing - the idea that this data is accurate I would say is scientifically ridiculous."

On the other hand, there have been suggestions this sort of data will allow computers to gain a profound understanding of people and their preferences.

In a news conference on Tuesday, Cambridge Analytica's spokesman said the company had also found Dr Kogan's data set to be "virtually useless".

The orthodox view among data scientists is that the use of social media data to target adverts on Facebook is in its infancy and not yet hugely effective - but Dr Kogan is going further than that, saying that it was completely without value.

Reality Check has seen Dr Kogan's unpublished research into the value of predicted personalities for micro-targeting. We judge that he is underselling its value although he is correct to say that the data was not accurate.

Personality test

Let's go back to where the data came from and what it included.

Dr Kogan had a personality-testing app on Facebook. Users answered questions about themselves and were given scores on the Big Five personality traits: openness, conscientiousness, extraversion, agreeableness and neuroticism. These traits are used by research psychologists and advertisers.

Dr Kogan says about 270,000 users took this test. Taking the test also gave the app data on all the users' friends, which created a database of 30 million people and their predicted personality scores, according to Dr Kogan. Facebook puts the figure at up to 87 million.

These personality predictions are based on the idea that if, for example, people who liked particular brands of sports cars and nightclubs turned out to be extraverts, then you might predict that other people who liked those things would also be extraverts.
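The idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the page names, the scores and the simple averaging rule are hypothetical, and it is not a description of the method Dr Kogan or SCL actually used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: each test-taker's liked pages and their
# measured extraversion score on a 0-1 scale.
test_takers = [
    ({"sports_car_brand", "nightclub"}, 0.9),
    ({"sports_car_brand"}, 0.7),
    ({"nightclub", "poetry_page"}, 0.6),
    ({"poetry_page", "chess_club"}, 0.2),
    ({"chess_club"}, 0.3),
]

def learn_page_scores(samples):
    """Average the measured score of everyone who liked each page."""
    by_page = defaultdict(list)
    for likes, score in samples:
        for page in likes:
            by_page[page].append(score)
    return {page: mean(scores) for page, scores in by_page.items()}

def predict(likes, page_scores, default):
    """Predict a score as the mean of the averages for the pages liked.

    Falls back to `default` when none of the pages has been seen before.
    """
    known = [page_scores[p] for p in likes if p in page_scores]
    return mean(known) if known else default

page_scores = learn_page_scores(test_takers)
overall = mean(score for _, score in test_takers)

# A friend who never took the test but liked the same sorts of pages.
friend_likes = {"sports_car_brand", "nightclub"}
print(predict(friend_likes, page_scores, overall))
```

In this toy example the friend gets a high predicted extraversion score purely because of the pages they liked, which is also why such predictions can easily be wrong for any one person.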

You can see a similar sort of system on the website of the Psychometrics Centre at the University of Cambridge, which attempts to predict your personality test result based on your social media activity.

Inaccurate predictions

Dr Kogan's research was funded by SCL, the research and communications company that formed Cambridge Analytica. Dr Kogan passed the data, including some of the pages that users had liked, to SCL.

Dr Kogan now says that the data he gave to SCL was useless for targeting adverts on Facebook because individual predictions were too inaccurate.

But some data scientists argue that the overall quality of the personality predictions is not the most important measure.

Part of the point of targeted advertising is to reduce costs by trying to appeal to only a relatively small number of users.

So you might be more interested in people turning up at the extremes of particular personality measures rather than those coming up as being close to average, because they are the ones most likely to exhibit the traits you are targeting.

So the overall reliability of the data may be less important than finding groups that can be targeted.
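A rough simulation shows why the extremes can still be useful when predictions are weak. The figures here are assumptions chosen for illustration, in particular the correlation of 0.3 between predicted and true scores. They are not taken from Dr Kogan's research.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
CORRELATION = 0.3       # assumed accuracy of the predictions (illustrative)

people = []
for _ in range(20000):
    true_score = rng.gauss(0, 1)
    noise = rng.gauss(0, 1)
    # A noisy prediction that correlates only weakly with the true score.
    predicted = CORRELATION * true_score + (1 - CORRELATION**2) ** 0.5 * noise
    people.append((predicted, true_score))

# Keep only the 5% of people with the highest predicted score.
people.sort(key=lambda p: p[0], reverse=True)
top = people[: len(people) // 20]

def share_above_average(group):
    """Fraction of the group whose true score is above the population average."""
    return sum(1 for _, true_score in group if true_score > 0) / len(group)

print("everyone:", share_above_average(people))
print("top 5% by prediction:", share_above_average(top))
```

Under these assumptions, about half of the whole population is above average on the trait, as you would expect, but the share is noticeably higher in the group picked from the extreme of the predictions, even though any individual prediction is unreliable.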

Also, Dr Kogan argues that trying to assess the personality of an individual gives too large a margin of error so the predictions are reliable only if you're taking averages across larger groups. But looking at larger groups may be helpful during an election, when you might be trying to decide where to buy advertising on local radio or where to hold an election rally, for example.
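The arithmetic behind that point can be sketched as follows, assuming the errors on individual predictions are independent of one another. The function name and the example error of 1.0 are hypothetical.

```python
import math

def margin_of_error(individual_error, group_size):
    """Standard error of a group average, assuming independent errors."""
    return individual_error / math.sqrt(group_size)

# The same individual-level error shrinks as the group gets larger.
for size in (1, 100, 10000):
    print(size, margin_of_error(1.0, size))
```

Under that assumption the error on a group average falls with the square root of the group size, so an average over 10,000 people is about 100 times more precise than a single prediction. Errors that are shared across a whole group, such as a systematic bias in the predictions, would not shrink in this way.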

So Dr Kogan is underselling the value of his dataset. While not all of it would have been useful, parts of it could have been helpful.
