How can you tell if a computer can think? In 1950, Alan Turing – the father of computer science – suggested a simple test. Step one: design a computer program that can simulate human conversation. (Which was no mean feat, given how primitive the computers of the mid-twentieth century were.) Step two: place it behind a screen or otherwise conceal it from view. Step three: invite a human being to converse with the computer in the form of text messages. Step four: ask the person whether their unseen interlocutor is a fellow human or a machine.

If he or she mistakes the computer's side of the conversation for a human's, then voilà: the computer, according to Turing, can be said "to think".

It sounds more like a parlour game than a thought experiment, and many in the field regard it as just that. Nevertheless, this "Turing test" went on to drive decades of research on artificial intelligence. It even spawned an annual contest, the Loebner Prize, held since 1991, in which judges hold short conversations with concealed artificial-intelligence programs and humans, and then have to decide which is which. (In the absence of any program passing the test, a smaller prize is given to the most “humanlike” one.)

Turing predicted that a computer would successfully fulfill his thought experiment before the year 2000, but to this day, no computer program has passed the test – not even yesterday’s winner of the 2012 Loebner Prize, Chip Vivant. This is partly due to the mushy parameters of the test itself – for instance, how long must the computer engage in freewheeling conversation before the human passes judgment on its identity? Five minutes? Three hours? Turing never says – but also because flawlessly imitating human conversation turns out to be more complicated than anyone expected. So what would it take to build a machine that can pass Turing's infamous test?

Mind your language

One thing is for sure: brute-force logic will not do the trick. In the early days of artificial intelligence research, "thinking" was assumed to be a simple matter of connecting symbols together using discrete rules. "There was this idea in the 1960s of cutting the world up into objects and actions that you can name: book, table, talking, running," says Robert French, a cognitive scientist at the French National Centre for Scientific Research. "All the words in the dictionary are symbols that refer to the world. So if you put them all together in a careful manner, intelligence should emerge, roughly."

Except it doesn't. This approach, called "symbolic AI," snaps like a twig when subjected to the slightest bit of ambiguity. After all, no dictionary rule can tell a computer how to appropriately respond to the casual question, "What's up?" (If you answer "Up is the opposite of down," you have just failed the Turing test.) A densely interconnected database may contain "intelligent" information, but it is not intelligent itself. As Brian Christian, author of The Most Human Human: What Artificial Intelligence Tells Us About Being Alive, puts it: "When we read a book, we don't think the book has the ideas."
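To see how brittle a lookup of this kind is, consider a rough sketch in Python. It is not taken from any real symbolic AI system: the `DEFINITIONS` table and the `symbolic_reply` function are invented here purely for illustration, and the program simply defines the first word it recognises.

```python
# Hypothetical miniature "dictionary" of symbols; invented for illustration.
DEFINITIONS = {
    "up": "Up is the opposite of down.",
    "book": "A book is a set of written pages bound together.",
    "table": "A table is a piece of furniture with a flat top.",
}


def symbolic_reply(question: str) -> str:
    """Answer by defining the first known word found in the question."""
    for raw_word in question.lower().split():
        # Drop surrounding punctuation so "up?" is looked up as "up".
        word = raw_word.strip("?!.,'\"")
        if word in DEFINITIONS:
            return DEFINITIONS[word]
    return "I do not know any of those words."


print(symbolic_reply("What's up?"))  # Up is the opposite of down.
```

The rule is applied correctly, yet the reply misses the point of the question entirely, which is exactly the failure described above.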

A much better way to simulate human conversation is to sidestep logic and aim for a quality called "statelessness". In a stateless conversation, each response only has to vaguely reference the one that came immediately before it. This behaviour is much easier to program into a computer, which is why so-called “chatbots” have become so prevalent online.

In the mid-1960s, ELIZA, one of the world's first chatbots, effectively impersonated a psychotherapist by parroting users' language back at them. In the 2000s, a more sophisticated chatbot called ALICE won the Loebner Prize three times by using essentially the same technique. Still, these stateless interactions are hardly what anyone would call "intelligent" – ironically, it is their almost total vacuousness that makes them seem so human. But not human enough, it seems: ALICE may have outperformed other chatbots, but it still could not fool human judges consistently enough to pass the Turing test.
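The parroting trick can be sketched in a few lines of Python. This is a loose illustration in the spirit of ELIZA, not a reconstruction of its actual script: the `REFLECTIONS` table, the three `RULES` and the `FALLBACK` line are assumptions made up for this example. Note that `respond` looks only at the message in front of it and keeps no memory of earlier turns, which is what makes it stateless.

```python
import re

# Hypothetical pronoun swaps; a real chatbot script would be far larger.
REFLECTIONS = {
    "i": "you",
    "me": "you",
    "my": "your",
    "am": "are",
}

# (pattern, reply template) pairs, tried in order. Illustrative only.
RULES = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]

FALLBACK = "Please go on."


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones so the echo reads naturally."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(message: str) -> str:
    """Reply using only the current message; no earlier turns are stored."""
    cleaned = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK


print(respond("I feel sad about my job."))  # Why do you feel sad about your job?
print(respond("What's up?"))                # Please go on.
```

Nothing in the program understands sadness or jobs. It rearranges the user's own words, and when no pattern fits it falls back on a stock phrase.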

So if knowing plenty of facts and making hollow small talk are not skills that allow a computer to pass the Turing test, what aspect of "human-like intelligence" is being left out? The one thing we have that computer programs do not: bodies.

Total recall

According to French, intelligence actually floats on a "huge sea of stuff underneath cognition" – and most of it consists of associative sensory experiences, the "what it's like"-nesses built up by a physical body interacting with a physical world. This "subcognitive" information could include the memory of falling off a bike and skinning your knee, or biting into a sandwich at the beach and feeling sand crunch between your teeth. But it also includes more abstract notions, like the answer to the following question: "Is 'Flugly' a better name for a glamorous actress or a teddy bear?"

Even though "flugly" is a nonsense word, almost any English-speaking human would pick the teddy bear, says French. Why? "A computer doesn't have a history of embodied experience encountering soft teddy bears, pretty actresses, or even the sounds of the English language," French says. "All these things allow human beings to answer these questions in a consistent way, which a computer has no access to." Which means any disembodied program has an Achilles heel when it comes to passing the Turing test.

But that may soon change. French cites "life-capturing" experiments, like MIT researcher Deb Roy's ongoing efforts to record every waking second of his infant son's life, as a possible way around the embodiment problem. "What would happen if a computer were exposed to all of the same sights and sounds and sensory experiences that a person was, for years and years?" French says. "We can now collect this data. If the computer can analyse it and correlate it correctly, is it unreasonable to imagine that it could answer 'Flugly'-type questions just like a human would?"

French does not think on-the-fly analysis of such a massive dataset will be possible anytime soon. "But at some point in time, we're going to get there," he says. At which point – assuming it works, and a computer program passes the Turing test – what will this mean in practical terms? Will we deem the device intelligent? Or will we simply add “have a convincing conversation” to the ever-growing list of interesting things that computers can do, like “beat humans at chess” (as IBM’s Deep Blue did in a match with Garry Kasparov in 1997) or “play Jeopardy! on television” (as Watson did in 2011)?

The renowned computer scientist Edsger Dijkstra said that "the question of whether machines can think is about as relevant as the question of whether submarines can swim." Semantics and philosophy of "intelligence" aside, a computer that can pass the Turing test can do exactly one thing very well: talk to people like an individual. Which means that the Turing test may simply be replaced by a different question, one that is no less difficult to answer. "We can imitate a person," says Brian Christian. "But which one?"
