Ancient languages reconstructed by computer program

Image caption: Unlike this Assyrian script, few ancient languages have a written record, which makes reconstructing them extremely difficult

A new tool has been developed that can reconstruct long-dead languages.

Researchers have created software that can rebuild protolanguages - the ancient tongues from which our modern languages evolved.

To test the system, the team took 637 languages currently spoken in Asia and the Pacific and recreated the early language from which they descended.

The work is published in the Proceedings of the National Academy of Sciences.

Currently, language reconstructions are carried out by linguists, but the process is slow and labour-intensive.

Dan Klein, an associate professor at the University of California, Berkeley, said: "It's very time consuming for humans to look at all the data. There are thousands of languages in the world, with thousands of words each, not to mention all of those languages' ancestors.

"It would take hundreds of lifetimes to pore over all those languages, cross-referencing all the different changes that happened across such an expanse of space - and of time. But this is where computers shine."

Rosetta stone

Languages change gradually over time.

Over thousands of years, tiny variations in the way that we produce sounds have meant that early languages have morphed into many different descendants.

Dr Klein explains: "These sound changes are almost always regular, with similar words changing in similar ways, so patterns are left that a human or a computer can find.

"The trick is to identify these patterns of change and then to 'reverse' them, basically evolving words backwards in time."
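The idea of "reversing" regular sound changes can be illustrated with a toy sketch. It is not the researchers' method: the words, the rules and the function names `apply_changes` and `reverse_changes` are all invented for illustration, and real sound changes are rarely this tidy.

```python
# Toy illustration only: the rules and words below are invented, not real data.

def apply_changes(word, rules):
    """Apply each (old, new) sound change to a word, in order."""
    for old, new in rules:
        word = word.replace(old, new)
    return word


def reverse_changes(word, rules):
    """Undo the changes by applying them backwards, last rule first."""
    for old, new in reversed(rules):
        word = word.replace(new, old)
    return word


# Hypothetical regular changes in one invented daughter language:
# every "p" became "f", and every "t" became "s".
RULES = [("p", "f"), ("t", "s")]

ancestors = ["pata", "tapu", "kapi"]
descendants = [apply_changes(w, RULES) for w in ancestors]
recovered = [reverse_changes(w, RULES) for w in descendants]

print(descendants)
print(recovered)
```

The reversal only works here because the invented ancestor words contain no "f" or "s" of their own. When sounds merge, a single descendant word cannot say which original sound it came from, which is one reason reconstruction relies on comparing many related languages.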

The scientists demonstrated their system by looking at a group of Austronesian languages that are currently spoken in southeast Asia, parts of continental Asia and the Pacific.

From a database of 142,000 words, the system was able to recreate the early language from which these modern tongues derived. The scientists believe it would have been spoken about 7,000 years ago.

They then compared the computer's findings to those of linguists, finding that 85% of the early words that the software presented were within one "character" - or sound - of the words that the language experts had identified.
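A comparison of this kind can be sketched with an edit-distance count. The sketch below assumes that "within one character" means at most one insertion, deletion or substitution (Levenshtein distance), which the article does not spell out. The word pairs and the function names `edit_distance` and `share_within_one` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-symbol insertions,
    deletions or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def share_within_one(pairs):
    """Fraction of (computer, linguist) pairs at most one edit apart."""
    close = sum(1 for a, b in pairs if edit_distance(a, b) <= 1)
    return close / len(pairs)


# Invented pairs: (computer reconstruction, linguist reconstruction).
PAIRS = [("pata", "pata"), ("tapu", "tabu"), ("kapi", "kap"), ("manu", "mino")]

print(share_within_one(PAIRS))  # 0.75 for these invented pairs
```

Each symbol in the strings stands in for one sound, so a real comparison would need words written in a consistent phonetic notation rather than everyday spelling.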

But while the computerised method was much faster, the scientists said it would not put the experts out of a job.

The software can churn through large amounts of data quickly, but it does not bring the same degree of accuracy as a linguist's expertise.

Dr Klein said: "Our system still has shortcomings. For example, it can't handle morphological changes or reduplications - how a word like 'cat' becomes 'kitty-cat'.

"At a much deeper level, our system doesn't explain why or how certain changes happened, only that they probably did happen."

While researchers are able to reconstruct languages that date back thousands of years, there is still a question mark over whether it would ever be possible to go even further back to recreate the very first protolanguage from which all others evolved, or whether such a language even exists.
