March 18, 2008

The Math of March Madness

Unless your college basketball antenna is completely out of order, you are probably aware that 'bracket season' has begun. This week, millions of Americans are penciling in their annual predictions for the NCAA Basketball Tournament, better known as March Madness, which essentially begins on Thursday. If you haven't yet participated in this indispensable slice of Americana, here is a blank bracket you can fill out.

Between now and April 7th, when the champion is crowned, 63 games played over three weekends of basketball bliss will reduce the field of 64 teams (I know, it's technically 65 but I never include the all but meaningless play-in game) down to one. Four gigantic regionals, each consisting of teams seeded 1 through 16, serve as the treasure map leading to hoops glory. As talk turns to #1 seeds, upsets and predictions, a question often arises: "What are the chances of guessing the entire bracket correctly? Has anyone ever done it?"

Those are valid inquiries indeed. After all, there is a 1 in 64 chance of picking the national championship winner. The fact that no team seeded worse than 8th (Villanova in 1985) has ever cut down the nets reduces the list of possible winners down to 32 (as there are 4 teams of each seed). Besides, there have been numerous instances when someone (including yours truly in 2004) has accurately picked the fabled Final Four teams. Surely then, it must be at least possible that someone somewhere could pick the winner of all 63 tournament games. With millions of people making their guesses every year, it shouldn't be long before some unsuspecting Grandma in Looneyville, West Virginia (a real town by the way) fills out the perfect bracket by chance. Right?

Such a theory, while grand and glorious, is more than unlikely. In fact, the mathematical probability of the perfect bracket is a mind-boggling 4.2 BILLION to one, by far the worst odds of any lottery in world history. So that's why Yahoo is offering 5 million bucks to anyone who fills out a perfect bracket (and only 10 grand for most accurate bracket in the land). With odds of 4.2 billion to one, that means that every single man, woman, child, dog, cat, rabbit, hamster, squirrel and skunk in America would have to each submit 14 uniquely separate brackets that no other man, woman, child, dog, bird, beast, loon etc. had already filled out in order for that one lucky individual to emerge with a golden ticket as the Charlie Bucket of bracketology. But 4.2 billion ways to fill out a bracket? How could there be so many possibilities from such an innocent looking single-page tournament diagram? If you are brave enough to continue, here's the math:

If the tournament only had four teams (we'll call them A, B, C and D), there would be 4 possible scenarios for the final game, assuming that Team A plays against Team B while Team C plays against Team D in the semis.

(A vs. B) + (C vs. D) =

1. A vs. C
2. A vs. D
3. B vs. C
4. B vs. D

Simple enough. Well, it gets a bit trickier for an 8-team bracket with four first round games:

(AB) (CD) (EF) (GH)

With 8 teams in the tournament, there are now 16 possibilities for the first round games alone. Each letter represents who would win each of the 4 games:

1.ACEG 5.ADEG 9.BCEG 13.BDEG
2.ACEH 6.ADEH 10.BCEH 14.BDEH
3.ACFG 7.ADFG 11.BCFG 15.BDFG
4.ACFH 8.ADFH 12.BCFH 16.BDFH
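
The same list can be generated rather than written out by hand. Here is a minimal Python sketch, assuming the four pairings shown above (the function name first_round_outcomes is made up for illustration):

```python
from itertools import product

def first_round_outcomes(games):
    """Return every possible set of winners, one letter per game."""
    # product() picks one team from each pairing in every possible way.
    return ["".join(winners) for winners in product(*games)]

games = ["AB", "CD", "EF", "GH"]  # the four first round pairings
outcomes = first_round_outcomes(games)

print(len(outcomes))   # 16, i.e. 2 ** 4
print(outcomes[:4])    # ['ACEG', 'ACEH', 'ACFG', 'ACFH']
```

Each game doubles the count, which is why 4 games give 2 x 2 x 2 x 2 = 16 outcomes.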

With me so far? Let's look at a 16-team bracket with 8 games in the first round:
(A vs. B) (C vs. D) (E vs. F) (G vs. H) (I vs. J) (K vs. L) (M vs. N) (O vs. P)

By adding 8 more teams, we've added 240 more permutations! Even if we're only looking at the first round, there are still 256 permutations:

128 of them begin with A (the other 128 begin with B)
64 begin with AC
32 begin with ACE
16 begin with ACEG
8 begin with ACEGI
4 begin with ACEGIK
2 begin with ACEGIKM

To give you a feel for the number of possibilities, here are the first 32 permutations. Keep in mind, this is just for the first round of a tournament with 16 teams!

1.ACEGIKMO 9.ACEGJKMO 17.ACEHIKMO 25.ACEHJKMO
2.ACEGIKMP 10.ACEGJKMP 18.ACEHIKMP 26.ACEHJKMP
3.ACEGIKNO 11.ACEGJKNO 19.ACEHIKNO 27.ACEHJKNO
4.ACEGIKNP 12.ACEGJKNP 20.ACEHIKNP 28.ACEHJKNP
5.ACEGILMO 13.ACEGJLMO 21.ACEHILMO 29.ACEHJLMO
6.ACEGILMP 14.ACEGJLMP 22.ACEHILMP 30.ACEHJLMP
7.ACEGILNO 15.ACEGJLNO 23.ACEHILNO 31.ACEHJLNO
8.ACEGILNP 16.ACEGJLNP 24.ACEHILNP 32.ACEHJLNP

ONLY 224 to go!
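
If you would rather not type out the other 224, a short Python sketch can list them and confirm the prefix counts above. It assumes the eight pairings from the 16-team example; the variable names are only illustrative:

```python
from itertools import product

# The 8 first round pairings of the 16-team example.
games = ["AB", "CD", "EF", "GH", "IJ", "KL", "MN", "OP"]
outcomes = ["".join(winners) for winners in product(*games)]

print(len(outcomes))  # 256

# Count how many outcomes begin with each prefix listed above.
prefixes = ["A", "AC", "ACE", "ACEG", "ACEGI", "ACEGIK", "ACEGIKM"]
prefix_counts = {
    prefix: sum(1 for outcome in outcomes if outcome.startswith(prefix))
    for prefix in prefixes
}
for prefix, count in prefix_counts.items():
    print(prefix, count)
```

Each extra letter fixed in the prefix halves the count, from 128 down to 2.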

SUMMARY: As you increase the size of the tournament, the number of permutations grows exponentially. By the time you look at the real NCAA bracket of 64 teams, you will have 4,294,967,296 different ways of filling out the first round alone! The good news is that the rest of the tournament is a walk in the park if you can just get those first 32 games exactly right. If you do somehow beat the odds in that first round, it gets a lot easier in the second round. There is just a 1 in 65,000 chance of accurately predicting the Sweet 16; no sweat compared to 4.2 billion. The table below shows how guessing all 32 games correctly in the first round is almost as hard as guessing the whole bracket:

NCAA ROUND BY ROUND PERMUTATIONS

Teams ------------- No. of Games --------- No. of Permutations
2 (Championship) ------- 1 --------------------- 2
4 (Final Four) ------------ 2 --------------------- 4
8 (Elite Eight) ----------- 4 --------------------- 16
16 (Sweet Sixteen) ----- 8 --------------------- 256
32 (Second Round) ----- 16 -------------------- 65,536
64 (First Round) -------- 32 -------------------- 4,294,967,296

THE SUM OF ALL 6 ROUNDS = TOTAL MARCH MADNESS PERMUTATIONS: 4,295,033,110
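
The figures in the table are all powers of two and easy to check. The Python sketch below (variable names are my own) reproduces each row and the sum quoted above. For comparison it also multiplies the rows together, which is the count you get if a complete bracket is treated as one choice of outcomes from every round at once. Several commenters below argue for that reading.

```python
# Games per round in a 64-team bracket, first round through the championship.
games_per_round = [32, 16, 8, 4, 2, 1]

# Each game has 2 possible winners, so a round with g games has 2 ** g outcomes.
per_round = [2 ** g for g in games_per_round]

total_as_sum = sum(per_round)      # the tally used in the table above

total_as_product = 1
for count in per_round:
    total_as_product *= count      # every round combined with every other round

print(per_round)         # [4294967296, 65536, 256, 16, 4, 2]
print(total_as_sum)      # 4295033110
print(total_as_product)  # 9223372036854775808, i.e. 2 ** 63
```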

Although 4.2 billion is a huge number, the odds significantly improve when you consider the fact that a #16 seed has never beat a #1 seed in the first round. So by using history to forecast 4 of the first 32 first round games, there are only 28 outcomes up in the air. Amazingly, you have actually removed over 4 billion possibilities from the first round just by eliminating these 4 outcomes from the equation!

How is this possible? Let's go back to the earlier example of an 8-team bracket with 4 first round games. Again, each letter represents who would win each of the 4 games:

1.ACEG 5.ADEG 9.BCEG 13.BDEG
2.ACEH 6.ADEH 10.BCEH 14.BDEH
3.ACFG 7.ADFG 11.BCFG 15.BDFG
4.ACFH 8.ADFH 12.BCFH 16.BDFH


Let's say that some #1 seed like Memphis or North Carolina was 'Team A' and they were matched up against an ill-fated #16 seed represented here by 'Team B'. If you knew for sure that Team A was going to win, you could eliminate the 8 permutations that have Team B winning, or half of the 16 possibilities listed above. In other words, you cut the number of permutations in half with each outcome you eliminate!

Let's apply this to the real-life NCAA first round which involves 32 games. To go from 32 down to 28 games, we will divide 4,294,967,296 in half 4 times (once for each of the doomed #16 seeds):

4,294,967,296 ÷ 2 = 2,147,483,648
2,147,483,648 ÷ 2 = 1,073,741,824
1,073,741,824 ÷ 2 = 536,870,912
536,870,912 ÷ 2 = 268,435,456
TOTAL ------------ 4,026,531,840 permutations lost by forecasting losses by the #16 seeds

4,294,967,296 - 4,026,531,840 = 268,435,456 first round permutations with 28 winners

When you add 268,435,456 to the 65,814 possibilities for the remaining rounds, the new total is 268,501,270 March Madness permutations, a reduction of over 4 billion!!
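
The halving arithmetic can be replayed in a few lines of Python. This is only a sketch of the steps above, using the post's own figures (65,814 is the sum for the five later rounds in the table); the variable names are illustrative:

```python
first_round = 2 ** 32      # 4,294,967,296 outcomes for 32 games
later_rounds = 65_814      # sum for the remaining five rounds, from the table

remaining = first_round
halvings = []
for _ in range(4):         # one halving per #1 vs. #16 game
    remaining //= 2
    halvings.append(remaining)
    print(f"{remaining:,}")

removed = first_round - remaining
new_total = remaining + later_rounds

print(f"{removed:,}")      # 4,026,531,840
print(f"{new_total:,}")    # 268,501,270
```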

SUMMARY: When you eliminate the possibility of any #1 seeds losing in the first round, the odds are roughly 1 in 268 million for guessing the entire bracket correctly.

You may also be wondering to yourself (or not), "If it's basically impossible to get the whole first round correct, then why is it that people can regularly predict the Final Four accurately?" There are several reasons for this:

1) In order to guess the Final Four correctly, you only need to guess (at minimum) 16 of the first 60 games correctly (4 wins for each of the 4 teams that make it). In other words, it's possible to guess wrong 70% of the time and still get the Final Four correct. Of course, you could also guess 98% correctly and still NOT get the Final Four correct if the one game you missed involved a Final Four team. Still, 16 out of 60 (27%) is a lot easier than 60 out of 60.

2) To guess the Final Four correctly, your four picks only need to be a small share of the teams still playing in each of the first 4 rounds: 4 of the 64 teams in the 1st round (1/16th), 4 of 32 in the second round (1/8th), 4 of the Sweet 16 (1/4th) and 4 of the Elite 8 (1/2). Difficult, but not impossible.

3) Over half of the time (12 out of the last 23 years), all of the Final Four teams have been seeded no worse than #4. In other words, more often than not, you could have picked the correct 4 teams from a pool of just 16 teams (those seeded 1 through 4). Since the current 64-team format began in 1985, there have been only 11 Final Four teams (out of 92 possible) seeded worse than #4. That means that about 88% of the time (81 out of 92), you can safely make your Final Four predictions from those top 16 teams.

4) If you narrow the options down to the top 4 seeds in each regional, the mathematical probability of accurately predicting the Final Four is 1 in 256, far better odds than the 268 million to 1 chance of guessing the whole bracket correctly. How can that be? Again, here's the math:

Using only the #1 and #2 seeds, there are 16 possible Final Four scenarios:

1111 1211 2221 2121
1112 1212 2222 2122
1121 1221 2211 2111
1122 1222 2212 2112

If you add in the #3 seeds, there are an additional 65 Final Four permutations:

1113 1311 2131 2321 3111 3232 3333
1131 1313 2132 2312 3122 3233 3331
1133 1331 2113 2323 3113 3222 3311
1123 1333 2123 2332 3131 3223 3313
1132 1321 2133 2333 3123 3211 3312
1231 1322 2213 2331 3132 3212 3321
1232 1323 2231 2313 3112 3221 3332
1233 1312 2223 2311 3121 3213 3323
1213 1332 2232 2322 3133 3231 3322
1223 2233

65 + 16 = 81 total permutations for seeds #1 through #3
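
These counts can be confirmed by generating the scenarios instead of listing them. A small Python sketch, assuming one seed is picked from each of the 4 regionals (final_four_scenarios is a made-up helper name, and the string form is only unambiguous for single-digit seeds):

```python
from itertools import product

def final_four_scenarios(max_seed):
    """All ways to pick one seed (1..max_seed) from each of the 4 regionals."""
    seeds = range(1, max_seed + 1)
    return ["".join(map(str, combo)) for combo in product(seeds, repeat=4)]

top_two = final_four_scenarios(2)
top_three = final_four_scenarios(3)
with_a_three = [s for s in top_three if "3" in s]  # scenarios added by the #3 seeds

print(len(top_two))       # 16
print(len(with_a_three))  # 65
print(len(top_three))     # 81, i.e. 3 ** 4
```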

Based on the pattern above, we can calculate the number of possible Final Four scenarios for any number of seeds included by raising that number to the 4th power (one choice of seed for each of the 4 regionals).

No. of seeds ----- Final Four permutations (no. of seeds to the 4th power)
1 ------------- 1
2 ------------- 16
3 ------------- 81
4 ------------- 256
5 ------------- 625
6 ------------- 1,296
7 ------------- 2,401
8 ------------- 4,096
9 ------------- 6,561
10 ----------- 10,000
11 ----------- 14,641
12 ----------- 20,736
13 ----------- 28,561
14 ----------- 38,416
15 ----------- 50,625
16 ----------- 65,536

As you can see, the number of permutations is much smaller for predicting the Final Four than for correctly filling out the whole bracket. Even if you allow the possibility of a #11 seed reaching the Final Four (as George Mason did in 2006), there are only about 15,000 possible Final Four scenarios (14,641 to be exact) compared to roughly 268 million bracket permutations!

But before you begin to think that there is a method to this madness, I offer this final word of caution. Although #1 seeds are the most likely to make the Final Four, there has never been a case when all four #1 seeds did this. 3 out of 4 is the most ever, which has happened 3 different times. Two years ago in 2006, none of the #1 seeds survived to the Big Dance's final weekend. Champion Florida was a #3, runner up UCLA was a #2, LSU was a #4 and George Mason was a #11 seed. It was just another typically unpredictable year. Last year, finalists Florida and Ohio State were both #1s, while UCLA and Georgetown were both #2 seeds.

For mathematical and other reasons, this tournament is my absolute favorite sporting event of the year. As you fearlessly complete your 2008 bracket, your best bet might just be to ignore the advice of the so-called "experts" and pick a few carefully chosen upsets, but not too carefully! Let the madness begin.

10 comments:

Unknown said...

ok, you just alienated half your viewing audience!! Too much math & sports! Where is the nuance, the narrative?

Unknown said...

on the plus side, I find you fabulously sexy.

dugdathug said...

You should have multiplied rather than added.... the real number of permutations is much much higher than you calculate.

Dan Stringer said...

How so?

Anonymous said...

boooooooooooooooooooooooooooooooooo

Anonymous said...

2^63 = 9 quintillion and lots of change. 2 possible outcomes for each game raised to the number of games in a 64 team tournament.

Anonymous said...

Old, I know, but to be clear as this came up early in a google search...

Anonymous' 9 quintillion argument is invalid as it assumes a team could lose in the first round and then win in the second round.

The original post is correct:
4,295,033,110 possible ways to fill out a bracket given:
A) It is pre-populated by the selection committee (it is);
and
B) It is one-and-done (it is).

Jon said...

"Although 4.2 billion is a huge number, the odds significantly improve when you consider the fact that a #16 seed has never beat a #1 seed in the first round. So by using history to forecast 4 of the first 32 first round games, there are only 28 outcomes up in the air. Amazingly, you have actually removed over 4 billion possibilities from the first round just by eliminating these 4 outcomes from the equation!"

So much for your 2018 bracket!

Mike said...
This comment has been removed by the author.
Mike said...

Since this post came up at the top of my google search results, it seems worth setting the matter straight.

dugdathug has the right idea: multiplication of permutations rather than summation.

Yes, there are 4,294,967,296 permutations in the first round of 32 games. In the second round, however, there are 65,536 permutations FOR EACH ONE of the 4,294,967,296 permutations from the first round, therefore 65,536 x 4,294,967,296 = 281,474,976,710,656 permutations from the first two rounds combined.

This is the same as 2^16 * 2^32 = 2^48 permutations for the 48 games played through two rounds.

The same method can be applied through the end of the bracket, a total of 63 games, for a total number of permutations of 2^63 = 9,223,372,036,854,775,808 .

The assertion that this math allows a losing team to re-enter IS NOT correct. Doubters, please see this post http://mathforum.org/library/drmath/view/56223.html
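
For anyone who wants to check this without following the link, small brackets can simply be counted by brute force. A rough Python sketch follows; the helper names brackets and no_reentry are made up for illustration, and it assumes a single-elimination bracket in which neighbouring teams meet in each round:

```python
from itertools import product

def brackets(teams):
    """Yield every complete bracket as a tuple of rounds, each a tuple of winners."""
    if len(teams) == 1:
        yield ()          # champion decided, nothing left to play
        return
    # Neighbouring teams meet: (A, B), (C, D), ...
    pairings = [teams[i:i + 2] for i in range(0, len(teams), 2)]
    for winners in product(*pairings):
        # Only this round's winners advance to the next round.
        for later_rounds in brackets(winners):
            yield (winners,) + later_rounds

def no_reentry(bracket, teams):
    """True if every round's winners all survived the previous round."""
    alive = set(teams)
    for winners in bracket:
        if not set(winners) <= alive:
            return False
        alive = set(winners)
    return True

for size in (4, 8, 16):
    teams = tuple("ABCDEFGHIJKLMNOP"[:size])
    all_brackets = list(brackets(teams))
    assert all(no_reentry(b, teams) for b in all_brackets)
    print(size, len(all_brackets), 2 ** (size - 1))
```

For 4, 8 and 16 teams the brute-force count matches 2 raised to the number of games (3, 7 and 15), and no eliminated team ever reappears in a later round.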