Who, What, Why: How do you reassemble shredded documents?

[Image: Two pages of shredded CIA documents, courtesy of the National Security Archive. Iranians spent years reconstructing US documents seized in 1979]

Governments and businesses have long used shredders to destroy sensitive documents. How easy is it to reassemble the pieces?

Almost every office has one - a document shredder and a bin filled with strips of paper fit for the bottom of a birdcage.

But in wartime, the shredded pages found in a captured bunker or command post could contain intelligence, if the thousands of pieces could be reassembled.

After Iranian students seized the US embassy in Tehran in 1979, they spent years painstakingly reassembling the intelligence reports and operational accounts shredded by the CIA officers who were the last workers captured.

[Image: Otavio Good. Mr Good says his software foresees how computers can unshred pages]

Now, a team of computer programmers from California have developed software they say shows that computers can, in theory, do most of the hard work.

It works by matching up individual shreds based on minuscule clues in each shred - the contour of the tears, a barely-visible watermark, and traces of writing, for instance - and can work incalculably faster than a human undertaking the same task.

It was the successful entry in a document shredder competition launched this autumn by the US military, in an attempt to encourage research on what is essentially a maths problem - how to assemble a puzzle efficiently.

In October, the Defense Advanced Research Projects Agency (Darpa), the Pentagon's research arm, offered $50,000 (£31,961) to the first team to reassemble five shredded hand-written documents and answer the puzzles contained in each of them.

"Any time you're in conflict or in war and you were to take over a building or a compound, it wouldn't be terribly surprising to have the enemy try to destroy or shred their documents," says Dan Kaufman, director of Darpa's information innovation office.

"How can we quickly put them together and get some value and try to save some lives?"

The answer

  • Manually - with large dedicated teams, lots of space and years to work
  • A standard office shredder reduces a 30-page document to more than 12,000 pieces
  • If each rectangular piece has four sides, that is a total of 1.2bn possible two-piece matches
  • Sophisticated "computer vision" software can - in theory - recommend possible two-piece matches to a human user who verifies the match is correct

A decent commercial shredder can reduce a sheet of paper to more than 400 pieces. That yields a total of 1,276,800 possible two-piece combinations - for one single-sided sheet.
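The article does not say how the figure of 1,276,800 was reached. One reading that reproduces it exactly is to count every unordered pair of piece edges, excluding pairs where both edges belong to the same piece. The short sketch below works through that arithmetic under this assumption; the function name `side_pairings` is made up for illustration.

```python
def side_pairings(pieces: int, sides_per_piece: int = 4) -> int:
    """Count unordered pairs of sides that belong to two different pieces."""
    total_sides = pieces * sides_per_piece
    # Each side can be tried against every side except those on its own piece.
    partners_per_side = total_sides - sides_per_piece
    # Halve the product so that (a, b) and (b, a) are counted once.
    return total_sides * partners_per_side // 2


print(side_pairings(400))     # 1276800, one single-sided sheet
print(side_pairings(12_000))  # 1151904000, about 1.2bn for a 30-page document
```

The count grows roughly with the square of the number of pieces, which is why a bin holding hundreds of pages is so much harder than a single sheet.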

Most office documents are a lot longer, many are printed on both sides, and the bin containing the shreddings could hold the remnants of hundreds of pages.

The last embassy workers captured by the Iranian students on 4 November 1979 included a team of CIA officers who had locked themselves in a vault in order to burn and shred sensitive embassy documents, says Malcolm Byrne, deputy director of the National Security Archive, a research organisation at George Washington University.

"When those guys gave themselves up, they left all the stuff in there thinking, 'okay, we've done our job,'" he says.

WHO, WHAT, WHY?

A part of BBC News Magazine, Who, What, Why? aims to answer questions behind the headlines

Instead, the Iranians laid the shreds out on a floor and devised a sophisticated procedure for numbering, indexing and reassembling the individual shreds, Mr Byrne says.

"Certainly it took a number of years for them to finish the process," he says.

The security forces later published the reconstructed documents in book form and sold copies all over Tehran, he says. And agents used the intelligence they gathered to identify and kill CIA collaborators.

The Darpa competition opened on 27 October, and more than 9,000 teams entered from across the US.

Each of the five shredded documents was presented online in high-quality digital format. Some documents were more than a single page and some had pieces missing.

The winning team was a group of California computer programmers led by Otavio Good, a former video game developer.

[Image: Darpa's shred puzzle, before the unshredding software and after]

He and his partners developed software that analysed the digital images of the shredded documents, using a concept called computer vision.

"We get the computer to look at where the ink is on the page and the shape of tear on the page," says Good, 37.

To reconstruct the document, a human user clicks on an individual piece that has been ingested into the software, then selects which side of the piece to check for a match. The software then recommends possible matches from the remaining unmatched pieces.

This continues until all the pieces have been matched up.
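The recommend-and-verify loop described above can be sketched roughly as follows. This is not the team's actual software, whose internals the article does not describe. It is a minimal illustration that assumes each edge of a piece has been reduced to a list of ink-darkness samples, and that pieces are already right-side-up so a right edge only ever meets a left edge. The names `Edge`, `mismatch` and `recommend` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    piece_id: int
    side: str             # "top", "right", "bottom" or "left"
    profile: list[float]  # assumed: ink darkness sampled along the edge


OPPOSITE = {"left": "right", "right": "left", "top": "bottom", "bottom": "top"}


def mismatch(a: Edge, b: Edge) -> float:
    """Sum of squared differences between two edge profiles (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a.profile, b.profile))


def recommend(selected: Edge, unmatched: list[Edge], top_n: int = 3) -> list[Edge]:
    """Rank unmatched edges from other pieces by how well they continue the selected edge."""
    candidates = [
        e for e in unmatched
        if e.piece_id != selected.piece_id and e.side == OPPOSITE[selected.side]
    ]
    return sorted(candidates, key=lambda e: mismatch(selected, e))[:top_n]


# The user picks the right-hand side of piece 1; the program suggests candidates.
selected = Edge(1, "right", [0.0, 0.9, 0.8, 0.0])
pool = [
    Edge(2, "left", [0.0, 0.8, 0.9, 0.1]),  # ink in nearly the same places
    Edge(3, "left", [0.7, 0.0, 0.0, 0.6]),  # ink in different places
    Edge(4, "top", [0.0, 0.9, 0.8, 0.0]),   # wrong side, filtered out
]
print([e.piece_id for e in recommend(selected, pool)])  # [2, 3]
```

The program only ranks candidates. A person still confirms or rejects each suggestion, which matches the division of labour Good describes.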

"The process was more about having a human verify what the computer was recommending," he says.

It took the team, called All Your Shreds Are Belong to US, about a month to develop and revise the software, and he estimates they spent about 600 man-hours on the programming and eventual solution of the puzzles.

"We basically spent every hour outside of work, working on this for a month," Good says.

"And we did it for the competition. If we were doing it for the money we would have come out a lot lower than we could have just doing contract work."

Good says the software the team developed has little potential as an off-the-shelf product for use by the world's militaries and intelligence agencies.

"I would call what we did a proof of concept," he says. "We put this stuff together very quickly... and it's very specifically tailored to each puzzle."

The Darpa documents were far simpler and neater than in a real case. All the sheets were single-sided, and all the pieces were laid right-side-up in an orderly fashion to make them easier to work with.

And the shreds were unmarred by, say, smoke, mud or blood, unlike pieces captured in the field.

Good says he approached the competition as a programming challenge and was "neutral" about the fact he was using his skills potentially to aid spies and soldiers.

"What we've done here is we've set the bar for where the security's at," he says.

"A lot of these shredders are maybe not as secure as you thought, and maybe you should get a better shredder if you want these really and truly not to be assembled."
