Clutching at Strings

Let’s do a problem together.


Problem. There are 100 pieces of string placed in front of you. You randomly grab two free ends and tie them together. You continue to do this until there are no free ends left. Now, there are just loops. How many loops do you expect to have in front of you?


I’ve said it before and I’ll say it again: scale your problems down. So let’s come up with an easier version of the same problem.

In this case, the obvious way to scale down is to work with fewer pieces of string.

One String

So let’s consider the same problem but for one piece of string. This is trivial as there are only two free ends (the two different ends of the same string) and so the only thing you can do in this case is to tie these two ends together to make a single loop.

Therefore, the answer to our problem for the simple case of one string is one loop.

Notation

Being mathsy folk, we should create some notation to help with our record-keeping. Let’s write E_n to mean the expected number of loops in the n-string scenario.

So far, we have worked out that E_1 = 1. The original problem is asking us to determine the value of E_{100}.

The one-string scenario was not particularly illuminating, so let’s step it up.

Two Strings

What about if we have two pieces of string? Here, we can follow one of two paths:

  1. You grab two ends of the same piece of string and then tie them together. This makes a loop. The rest of this scenario is locked in as you have to grab the remaining two free ends of the other piece of string and tie them together to make a loop. This gives you two loops.
  2. You grab two free ends, each belonging to a different piece of string. You tie these ends together and this leaves you with a single long piece of string. Then your next and only move is to tie the ends of this long piece of string together. This gives you one loop.

There are two paths to follow in the two-string experiment and they each give a different result.

To calculate the expected value E_2, we need to multiply the probability of each path with the result of that path. We could write something like this:

E_2 = p_1 L_1 + p_2 L_2.

In the above equation, p_1 means the probability that we go down path 1, which is where we start by grabbing two ends of the same string. We use L_1 to denote the number of loops we end up with when we go down this path (we already know that L_1 = 2.)

Similiarly, p_2 and L_2 denote the probability and the number of resulting loops when we go down the second path (we already know that L_2 = 1).

Really, we just need to work out the probabilities and then we can calculate our expected value of loops for the two-string case.

What is the probability p_1, that is, what is the probability that we grab two ends of the same piece of string?

I like to live my maths problems, so I’m going to imagine that my left hand has grabbed the end of a piece of string. I want to know the probability that my right hand grabs the other end of that same piece of string. There is a 1/3 chance of this happening, as there are three free ends remaining and only one of those is the free end on the same piece of string. Therefore, p_1 = 1/3.

Immediately, we also know that p_2 = 1 - p_1 = 2/3. And so we have that

E_2 = \frac{1}{3} \cdot 2 + \frac{2}{3} \cdot 1 = \frac{4}{3}.

The two-string problem is now solved!

At this point, it might feel like we are grasping at straws (or strings!) to find an idea that will help us solve the 100-string problem.

But it is there. Think about the two paths we could follow when we were sitting there with two free pieces of string. After one step, we would either have one long piece of string, or a loop and a piece of string.

This idea scales quite well to the case with 100 pieces of string.

100 Strings

Consider this. You are sitting there with 100 pieces of string. Just like in the two-string experiment, one of two things can happen when you tie two ends together.

  1. You grab two ends of the same piece of string and then tie them together. This makes a loop. Now you have one loop and 99 pieces of string.
  2. You grab two free ends, each belonging to a different piece of string. You tie these ends together to make a longer piece of string. Now you have 99 pieces of string.

Something smells like recursion. From looking at the above two paths, it’s clear that a solution to the 99 string problem will yield a solution to the 100 string problem.

As there are two paths we can go down, again we may write something like this:

E_{100} = p_1 L_1 + p_2 L_2.

Just like before, p_i means the probability that we go down path i and we use L_i to denote the number of loops we end up with when we go down this path.

Note that for path 1, we end up with 1 loop plus 99 pieces of string. Therefore, we have that L_1 = 1 + E_{99}.

For path 2, we end up with 99 pieces of string, and so we have that L_2 = E_{99}.

We can calculate p_1 and p_2 in a similiar way to the 2-string problem. You can double check my working, but I get that p_1 = 1/199 and p_2 = 198/199.

We now have all the information we need to substitute this into the equation

E_{100} = p_1 L_1 + p_2 L_2.

We get

E_{100} = \frac{1}{199} (1 + E_{99}) + \frac{198}{199} E_{99}.

A smidge of expansion gives us that

E_{100} = E_{99} + \frac{1}{199}.

Now, we need to find E_{99}. It’s looking pretty likely that we can express E_99 in terms of E_{98}. Indeed, if you go through all the working yourself, you will surely find that

E_{99} = E_{98} + \frac{1}{197}.

We can throw the previous two equations into the oven together to get:

E_{100} = E_{98} +\frac{1}{197} + \frac{1}{199}.

At this point, you are likely seeing the problem open up before your own eyes. You might even guess that

E_{100} = E_{97} + \frac{1}{195} +\frac{1}{197} + \frac{1}{199}

and certainly you would be correct.

You can do some mathematical induction, or you can just live on the wild side like me and jump to the bit where

E_{100} = E_{1} + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{195} +\frac{1}{197} + \frac{1}{199}.

Fortunately, we have already figured out E_1, and so we have that

E_{100} = 1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{195} +\frac{1}{197} + \frac{1}{199}

or, in shorthand:

E_{100} = \sum_{n=1}^{100} \frac{1}{2n-1}

So we have our answer in the form of a sum.

You can punch this into a computer or (if you think all maths should be done by hand) you can use the integral

\int_1^{100} \frac{dx}{2x-1}

as an approximation. The integral gives you an answer of 2.65 but the real answer from the sum is actually 3.28.

In any case, it’s quite a small number of expected loops!

3 comments

  1. Lovely problem, I certainly wouldn’t have guessed harmonic sums had much to do with it. I would like to point out that there is a lovely little unstated use of the tower rule of expectation somewhere in here, in a form I don’t think I’ve ever seen it before. Bonus points if you can spot it.

    Like

    • I will level with you. I had to Google the tower rule of expectation. When I saw the wiki page and definition I was like “oh it’s that thing that I just assume to be true”. Proof is very quick though, once you write everything as sums.

      Like

      • Yep, like most “laws” in probability and statistics, they’re true because they damn well should be, and the ‘should be’ usually pops out of a few basic properties of sums/integrals/measures(take your pick given the sample space at hand).

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s