In my mathematics department at The University of Auckland, there are two Melissas – me, and Melissa Tacy.
Over the course of my two years working in Auckland, we have encountered several occasions of mistaken identity. In one case, the other Melissa’s travel was almost cancelled because I was booking travel to a different place at the same time, and just yesterday, I was sent a copy of an exam that the other Melissa is proof-reading. Yikes!
It’s very annoying, but it got me thinking. Many of us have heard of the Birthday Paradox where, amongst 23 people in a room (with birthdays chosen uniformly at random), there is a better than 50% chance that at least two of them share a birthday. What if we apply this same idea to people with the same name?
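For anyone who wants to check the figure of 23, here is a minimal Python sketch of the classic birthday calculation. It assumes 365 equally likely birthdays and ignores leap years; the function name `prob_shared_birthday` is just my own label for illustration.

```python
def prob_shared_birthday(k: int, days: int = 365) -> float:
    """Probability that at least two of k people share a birthday."""
    p_all_distinct = 1.0
    for i in range(k):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


# Compare the probabilities for 22 and 23 people.
print(round(prob_shared_birthday(22), 4))
print(round(prob_shared_birthday(23), 4))
```

The probability first exceeds one half at 23 people, which is where the famous number comes from.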
Unfortunately, data on the frequency of first names is not available in Australia, but it is available in the US. So for the sake of argument, I will suppose that we are looking at people in the US.
According to the website, there are 337,070,229 people in the United States of America, and 778,632 of them are named Melissa. That works out to about 0.231%, a bit less than 1/365 ≈ 0.274%.
So then, how many people do you need to gather in a room before there’s over 50% chance that two of them are named Melissa? Note that this problem is a bit different to the traditional birthday problem described above – here I am requiring that two people have *my* name, not just any name in common.
Suppose that I am in the room (so that’s one Melissa sorted already). The probability that someone from the US, chosen uniformly at random, *doesn’t* share my name is (337,070,229-778,632)/337,070,229 ≈ 99.77%.
So it’s not looking good so far!
In a room with n people (not including me), the probability that *none* of them are named Melissa is:

((337,070,229-778,632)/337,070,229)^n ≈ 0.9977^n
So in order to compute when there’s a more than 50% chance that there’s another person named Melissa in the room, we can just find the first n such that the above expression is less than 0.5.
With a bit of Mathematica magic, it turns out that 301 people (including me) are sufficient!
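If you don’t have Mathematica handy, the same search can be sketched in a few lines of Python. This is not the code I originally ran, just an equivalent brute-force loop using the figures quoted above; the variable names are my own.

```python
US_POPULATION = 337_070_229
MELISSAS = 778_632

# Probability that one randomly chosen person is NOT named Melissa.
p_not_melissa = (US_POPULATION - MELISSAS) / US_POPULATION

# Keep adding people until the chance that none of them is a Melissa
# drops below 50%.
n = 0
p_none = 1.0
while p_none >= 0.5:
    n += 1
    p_none *= p_not_melissa

print(n)      # other people needed (not including me)
print(n + 1)  # room size including me
```

The loop stops at n = 300 other people, which gives the 301 people in the room once I am counted too.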
We can repeat this process with other names as well.
According to the same website, the most common name in the US is James, shared by 5,608,900 people (or about 1.66% of the population). If we repeat the same process, then only 43 people are required, counting the James that we already have to start with.
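To try this for any name, you can skip the search and solve p^n < 0.5 directly with logarithms, where p is the probability that a random person does not have the name. The sketch below assumes the same US population figure as above; `room_size_for_name` is a hypothetical helper name, and it counts the person who is already in the room.

```python
import math


def room_size_for_name(name_count: int, population: int = 337_070_229) -> int:
    """Smallest room size (including the person already present) giving a
    better than 50% chance that someone else shares their name."""
    if not 0 < name_count < population:
        raise ValueError("name_count must be between 0 and population")
    p_other = (population - name_count) / population
    # Smallest integer n with p_other**n < 0.5, i.e. n > log(0.5) / log(p_other).
    others = math.floor(math.log(0.5) / math.log(p_other)) + 1
    return others + 1  # add the person we started with


print(room_size_for_name(778_632))    # Melissa
print(room_size_for_name(5_608_900))  # James
```

Swapping in the count for your own name gives the corresponding room size, under the same assumption that people are chosen uniformly at random from the US population.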
Of course, we are supposing here that we are choosing people uniformly at random. If you were to instead think about this calculation for people belonging to a particular demographic, e.g., people of your age who live in your country, the answer might change drastically due to fluctuations in name popularity over time.
Why not try working it out for your name?
I’ll leave you with one further problem – in my calculations, I supposed that a person with a particular name was already in the room before we started.
How can we repeat this calculation if we do not make this assumption? Let me know in the comments!