NYT Trump column: Linguistic clues to White House insider?
We all have our own distinctive style of writing and speaking. Trying to hide those quirks is like trying to repress a part of our character.
This style is what can help you identify an author from reading only one paragraph of their work. But what happens if the author doesn't want to be identified?
It is fair to say that the author of an opinion column published in the New York Times on Wednesday would rather people not know who they are.
The anonymous column, headlined I Am Part of the Resistance Inside the Trump Administration, said people working for the president were working to frustrate parts of his agenda to protect the country from his "worst inclinations".
All sorts of speculation is now swirling as to who could be behind the article.
Might it be possible to pick up clues to the writer's identity by analysing their style?
Maybe. And there are some very intriguing clues around.
But also, maybe not. We decided to give it a go anyway.
Stage one: Our methodology
We ran the text of the New York Times column through some writing enhancement software to identify the author's stylistic traits (more on those later).
The New York Times said the column was written by someone "in the Trump administration" - this could mean the White House, the Pentagon, the state department or any number of departments.
So we ran a few weeks of statements issued by certain departments through the same software, to see which of those best matched the NYT column.
We assessed only speeches or official statements attributed to a person that were pre-prepared and not off-the-cuff (this ruled out a lot of President Trump's speeches).
Stage two: The caveats
And there are many, many caveats:
- The NYT said the author was "a senior official" in the administration. Not all senior officials issue statements - many work behind the scenes, as this author may well do
- And yes, not all official statements are written by the officials themselves - this is why they have staff
- The column will have gone through the hands of New York Times editors, so we don't know how closely the published column resembled what was submitted
- Having said that, opinion editor James Dao has said the submitted piece was well-written, telling the newspaper's The Daily podcast: "I was really quite impressed by the clarity of the writing and by the emotional impact of the writing"
- Mr Dao confirmed that editors did not remove stylistic clues to the writer's identity
- We don't know if the writer is male or female: in a tweet, the NYT referred to the author as "he"; it then put out a statement saying the tweet "was drafted by someone who is not aware of the author's identity"
- Our sample size is small and this is not a scientific method, so make of our conclusions what you will...
Stage three: The conclusions
The software we used homes in on certain characteristics of writing style, including how often the writer repeats words, when they use rare words, how often and where they use punctuation, how many characters they use in each word, and how long their sentences are.
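For readers who want to try something similar, here is a minimal Python sketch of three of those measurements: words per sentence, characters per word and how often words are repeated. It is not the software we used. The function names (split_sentences, split_words, style_profile) and the splitting rules are our own assumptions, and the rules are deliberately crude.

```python
import re


def split_sentences(text):
    # Naive rule (an assumption): a sentence ends at ., ! or ? followed by
    # whitespace. Abbreviations such as "Sen. McCain" would be split wrongly.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(text):
    # Runs of letters and apostrophes count as words; digits are ignored.
    return re.findall(r"[A-Za-z']+", text)


def style_profile(text):
    sentences = split_sentences(text)
    words = split_words(text)
    if not sentences or not words:
        raise ValueError("text contains no words")
    return {
        # Average sentence length, the figure quoted throughout this article.
        "words_per_sentence": len(words) / len(sentences),
        # Average word length in characters.
        "chars_per_word": sum(len(w) for w in words) / len(words),
        # Share of words that are distinct; lower means more repetition.
        "distinct_word_ratio": len({w.lower() for w in words}) / len(words),
    }


sample = "We all have a style. It is hard to hide."
print(style_profile(sample))
```

Running the same function over two texts and comparing the numbers gives a rough, unscientific sense of how alike they are.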
Compared with most of the official statements and speeches we analysed, the New York Times column had a distinctive style (again, some of this could be down to the editing process).
For a start, the average length of the sentences in the column is very low compared with government statements: only 19.3 words per sentence.
Compare this with statements by Press Secretary Sarah Sanders on Syria on 4 September (an average of 31 words a sentence) and Mr Trump in a letter to the Senate on 28 August (30 words a sentence).
There is one Trump administration official whose statements and speeches are always shorter than the others - sometimes significantly.
His name is Michael Richard Pence, the vice-president of the United States of America, and on Thursday, he denied he was the author of the column. Some had suggested he was responsible because the column used one unusual word - "lodestar" - he's been known to use.
Let's look at the evidence from Mr Pence's statements:
- on 31 August before the lying in state of late Senator John McCain: 17.4 words per sentence
- at the American Legion's 100th national convention on 30 August: 17.6 words per sentence
- in Houston on 23 August on the administration plan for space: 19.7 words per sentence
Well, you might say, surely Mr Pence's speeches are written by someone else?
This is true - although it is not clear how much input the vice-president has in writing his speeches.
However, we were also able to analyse old columns written by Mr Pence when he was a radio broadcaster in the 1990s. These too show a consistent style: short, easily digestible sentences - much shorter than most government statements.
Mr Pence's speeches and columns also show he favours shorter words than those we see in other government statements.
There is another piece of evidence pointing in the vice-president's direction.
Government statements very rarely use the passive voice, preferring the active voice instead - there are only a handful of examples of the former being used over the past few weeks.
However, the author of the column does use the passive voice, a few times:
- "Although he was elected as a Republican" instead of "Although the American people elected him as a Republican"
- "We have sunk low with him and allowed our discourse to be stripped of civility"
- "occasionally reckless decisions that have to be walked back"
The column's use of the passive voice is striking when compared with the White House statements. Except for all of Mr Pence's.
He used the construction seven times in his Houston speech, three times in his American Legion speech and, in one old column on why President Bill Clinton should be impeached, six times in only 916 words.
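Counting passives can be roughly approximated in a few lines of Python. The sketch below is a heuristic of our own, not the method the software uses: it looks for a form of "to be" followed by a word ending in "-ed" or by one of a short, hand-picked list of irregular participles. The names IRREGULAR, count_passive and per_thousand_words are assumptions for illustration, and the pattern will both miss some passives and wrongly flag phrases such as "is red".

```python
import re

# A small, incomplete list of irregular past participles (an assumption).
IRREGULAR = ["sunk", "written", "known", "given", "taken",
             "made", "done", "seen", "held", "told"]

# A form of "to be", an optional "-ly" adverb, then a likely participle.
PASSIVE = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+"
    r"(?:\w+ly\s+)?"
    r"(?:\w+ed|" + "|".join(IRREGULAR) + r")\b",
    re.IGNORECASE,
)


def count_passive(text):
    # Number of non-overlapping matches of the heuristic pattern.
    return len(PASSIVE.findall(text))


def per_thousand_words(count, total_words):
    # Normalise a raw count so texts of different lengths can be compared.
    return count / total_words * 1000


examples = [
    "Although he was elected as a Republican",
    "Although the American people elected him as a Republican",
    "occasionally reckless decisions that have to be walked back",
]
for line in examples:
    print(count_passive(line), line)
```

Normalising by length matters because a long speech has more opportunities to use the passive than a short column; per_thousand_words(6, 916) puts the six uses in the 916-word column on the same scale as any other text.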
We'll carry on running more tests on more statements released over a longer period of time, by the end of which - who knows - maybe the author will have been outed.
In the meantime, Mr Pence - or at least someone writing on behalf of Mr Pence - has continued to deny he was the author.
"The vice president puts his name on his op-eds," tweeted Jarrod Agen, Mr Pence's communications director and deputy chief of staff.
"The @nytimes should be ashamed and so should the person who wrote the false, illogical, and gutless op-ed. Our office is above such amateur acts."
(In case you were wondering, the software concludes that this article is very similar in style to that used by the anonymous New York Times writer.)