A statistically modelled wedding
It's the classic dilemma for anyone planning a wedding. The list of friends and family you'd like to invite is seemingly endless. Your budget is not, and neither is the venue's floor space. Could one couple have found a solution via statistical modelling?
Damjan Vukcevic and Joan Ko, planning their wedding in Melbourne, Australia, were struggling to draw up an invitation list of family and friends in places as far-flung as Serbia, Taiwan, the UK and the US.
"We're both from immigrant families, so the guest list had people from all around the world and, while it's easy to work out whether people in Melbourne - our home town - could come to the wedding, it was a much more difficult task to work out what proportion or how many of the overseas guests would show up," Joan says.
"So the challenge is to pick the right number of invitees so that we could get all our friends there, but to not overshoot the mark - and to stop my parents from inviting their friends willy-nilly, as well."
The venue they had chosen could fit 110 comfortably. Damjan and Joan calculated that having fewer than 100 guests meant they would be wasting the opportunity to celebrate with more friends, while having more than 110 guests would be too much of a tight squeeze.
They could have done what many people do - post the first batch of invitations, wait for replies and then send out a second round and, maybe, a third - but this wasn't an option the couple were keen on.
"Some people might get an invitation in the second round and if they're comparing notes they might wonder, 'Oh, what's going on here? Are we not in the best tier of friends, so to speak?'" Damjan says.
So one morning, Joan awoke to find that Damjan had been up late building a solution - a statistical model.
The couple started to list all the people they might want to invite to their wedding. They divided them into four categories, depending on how far away they were and how firm the friendship was, and then estimated the probability that a guest in each category would show up.
They made some other assumptions, too - for instance, that couples would either both come or both stay away, and that there were no outside influences, such as FA Cup finals, high airline prices or rival weddings, which might cause lots of guests to stay away at the same time.
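The kind of model described here can be sketched as a simple simulation. What follows is only an illustration, not the couple's actual spreadsheet: the four category names, the split between couples and singles, and every probability are invented for the example. Only the total of 139 invitees and the two assumptions (couples attend together or not at all, and every decision is independent) come from the story.

```python
import random

# HYPOTHETICAL inputs: the category names, head counts and probabilities
# below are invented for illustration. Only the total of 139 invitees
# matches the article.
# Each entry: category -> (probability of attending, couples, singles)
CATEGORIES = {
    "local, close friends": (0.95, 20, 15),
    "local, more distant": (0.80, 10, 10),
    "overseas, close friends": (0.50, 10, 10),
    "overseas, more distant": (0.20, 8, 8),
}


def build_units(categories):
    """Turn categories into (party size, probability) decision units.

    A couple is one unit of size 2, so both partners come or neither does.
    """
    units = []
    for prob, couples, singles in categories.values():
        units.extend([(2, prob)] * couples)
        units.extend([(1, prob)] * singles)
    return units


def simulate_attendance(units, trials=10_000, seed=0):
    """Simulate total attendance many times, each unit deciding independently."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = 0
        for size, prob in units:
            if rng.random() < prob:
                total += size
        totals.append(total)
    return totals


def prediction_interval(totals, level=0.95):
    """Return the central range containing roughly `level` of simulated totals."""
    ordered = sorted(totals)
    tail = (1 - level) / 2
    low = ordered[int(tail * len(ordered))]
    high = ordered[min(len(ordered) - 1, int((1 - tail) * len(ordered)))]
    return low, high


if __name__ == "__main__":
    units = build_units(CATEGORIES)
    totals = simulate_attendance(units)
    print("invited:", sum(size for size, _ in units))
    print("95% prediction interval:", prediction_interval(totals))
```

Treating a couple as a single unit of size two widens the spread of outcomes compared with treating each person separately, which is why that assumption matters for the width of the interval.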
Damjan and Joan plugged the numbers into a spreadsheet, and then sent out 139 invitations in the expectation they would receive between 100 and 110 acceptances.
As they waited for the replies to drop through the letter box, they were confident that, statistically speaking, success was almost certain.
Their model gave a 95% prediction interval of 102-113 acceptances. It suggested the most likely result would be 106 attendees.
So how many people finally did show up?
Exactly 105 - a triumph for statistical modelling!
Or maybe not. Because of the 105 people who went to the wedding, only 97 had been on the original guest list and in the spreadsheet.
"We had that problem that weddings often have - people showing up who probably didn't get their invitation in the first round," Joan says.
"So I think you should always have a buffer for people with new partners that you didn't know about, or friends and family that your parents have accidentally invited."
Damjan and Joan had made two sizeable statistical errors.
Firstly, they were too optimistic about how many people would accept their invitation. They assumed a 100% acceptance rate among the people living in their home town, which didn't turn out to be the case, and they were also surprised by how few of their overseas guests could make it.
Secondly, they didn't take account of the fact that there will always be people who weren't on the original list, who will come anyway.
But although the model was wrong, they still had just the right number of guests celebrate their marriage with them.
"All the errors cancel out," Joan laughs.
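How the two errors offset each other can be checked with a little arithmetic, using only the figures reported in the story. Comparing against the model's "most likely" figure of 106 is a choice made for this illustration, and the variable names are invented.

```python
# Figures reported in the article.
invited = 139                # invitations sent
predicted_most_likely = 106  # the model's most likely attendance
attended_total = 105         # people who actually came
attended_from_list = 97      # of those, people on the original list

# Error 1: over-optimism about invitees (the model expected more of them).
shortfall = predicted_most_likely - attended_from_list

# Error 2: guests the model never counted (not on the original list).
unplanned = attended_total - attended_from_list

# Net effect: the two errors pull in opposite directions.
net_error = attended_total - predicted_most_likely

# Share of the original invitees who actually attended.
observed_rate = attended_from_list / invited

if __name__ == "__main__":
    print("shortfall among invitees:", shortfall)
    print("unplanned guests:", unplanned)
    print("net error:", net_error)
    print("observed acceptance rate: {:.0%}".format(observed_rate))
```

The shortfall of nine invitees was almost exactly offset by eight unplanned guests, leaving the final head count just one below the model's most likely figure.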
And, undeterred, they've been further honing the model on their friends' weddings and hope to have it developed into an app one day.
Meanwhile, the key message for all you statistical modellers out there remains - if you can't be right, then be lucky.