I felt like writing something else this week. As it happens, no episodes of *War of the Worlds* aired during March of 1989 or 1990, and I’d planned to insert a gap of a few weeks in my own schedule there, to cover things like the Asylum movie and *ALF* and maybe the final episode of *Webster*. But there’s two more episodes of season one and one more of season two before that happens, and I feel like digressing now instead.

The Monty Hall Problem is a probability puzzle in a class of problems that are unusual in just how ridiculously hard it is to make your brain believe the right answer despite how simple it is. When columnist Marilyn vos Savant presented the correct solution to the puzzle in a 1990 column for *Parade* magazine (with the caveat that the *specific phrasing* of the puzzle she used contained a significant point of ambiguity), she was inundated with thousands of letters, many from PhDs, insisting she was wrong. Many of the writers were real assholes about it. Ridiculously famous mathematician Paul Erdős, famously, refused to believe the correct answer, even with formal mathematical proof, until he saw the result verified in computer simulation. I happened to overhear some people discussing the puzzle recently, and as usual, someone got it wrong and could not be convinced of the right answer.

The puzzle is this:

You are a contestant on $FAMOUS_GAME_SHOW. The host, $WELL_KNOWN_TELEVISION_PERSONALITY, presents you with a choice of three doors. Behind one of these doors is a BRAND NEW CAR(!), while behind the other two are goats. You are asked to select one of the three doors. After you have made your selection, the host opens one of the unselected doors, revealing a goat. He offers you a choice: open the door you initially selected and claim the prize behind it, or switch your selection to the remaining closed door. What course of action should you choose?

If you haven’t already heard this one, you almost certainly think that it makes no difference whether or not you switch: the car is equally likely to be behind either door. But whether or not you possess the comparatively rare gift of *statistical* literacy, you almost certainly possess enough basic *English* literacy to have worked out that if that were the answer, this would hardly be a famous brain teaser.

The answer, of course, is that you should always switch. The *probability* that switching is the right strategy depends on certain assumptions not present in the way I phrased the question, but there’s actually only one non-standard assumption you can make where switching is a bad idea (The one where Monty is an asshole and wouldn’t have opened the door had your initial choice been incorrect). Well, okay, switching is also a bad idea in the non-standard assumption that you’d rather win a goat than a car. Fair enough. My family already owns two cars, but zero goats. Some people think that clarifying the standard assumptions also makes the analysis more intuitive, but I don’t: the assumptions primarily impact the question “What’s the best general strategy on this game show?” rather than the question-as-asked, “What do you, the player, do next?”

These assumptions, for clarity, are:

- Monty *always* opens a door and offers a switch
- Monty *never* reveals the car when he opens a door
- If the player’s initial choice is the car, Monty chooses the door he will open at random
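
Since a computer simulation is what it took to convince Erdős, here is a minimal sketch of one in Python that plays the game under exactly those three assumptions. The function names (`play`, `win_rate`), the trial count, and the seed are my own choices for illustration, not anything canonical.

```python
import random

def play(switch, rng):
    """Play one round under the three standard assumptions; True means a car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty always opens a door, never the player's and never the car's.
    # If the player picked the car, both goats are available and he
    # chooses between them at random.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won when always switching (or always staying)."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Run it and the "stay" rate should hover near one third and the "switch" rate near two thirds, give or take sampling noise.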

That third one in particular I think just makes things more complicated. It’s necessary to close a loophole where you might be able to deduce some extra information from which door Monty opens: if Monty has a preference for lower-numbered doors, then Monty opening door number three means you should absolutely switch. But, crucially, you should still switch even if he doesn’t. The other two conditions don’t actually apply in the specific case described, since Monty *did* open a door and *didn’t* reveal a car. Removing them can reduce the probability that switching wins, but, setting aside a Monty who is deliberately playing against you, not below 50:50.
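
To make the lower-numbered-door loophole concrete, here is a small exact calculation, sketched under the assumption that you always start on door 1. The `prefer_low` flag is a hypothetical knob of mine: when it is set, a Monty with a free choice always opens door 2; when it is not, he chooses at random like the standard Monty.

```python
from fractions import Fraction

def switch_odds(prefer_low):
    """Exact P(switching wins | which door Monty opened), player on door 1."""
    half, third = Fraction(1, 2), Fraction(1, 3)
    seen = {2: Fraction(0), 3: Fraction(0)}          # P(Monty opens this door)
    seen_and_win = {2: Fraction(0), 3: Fraction(0)}  # P(opens it and switching wins)
    for car in (1, 2, 3):
        if car == 1:
            # Player already has the car, so Monty has a free choice.
            options = {2: Fraction(1)} if prefer_low else {2: half, 3: half}
        else:
            # Monty is forced to open the one unpicked goat door.
            options = {5 - car: Fraction(1)}
        for opened, p in options.items():
            seen[opened] += third * p
            if car != 1:
                seen_and_win[opened] += third * p
    return {door: seen_and_win[door] / seen[door] for door in seen if seen[door]}

if __name__ == "__main__":
    print("biased Monty:  ", switch_odds(prefer_low=True))
    print("standard Monty:", switch_odds(prefer_low=False))
```

With the biased Monty, seeing door 3 open makes switching a sure thing, and seeing door 2 open leaves it at even odds, so switching never costs you anything. With the standard Monty it is two thirds whichever door he opens.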

That’s why those assumptions are so often claimed to make the problem easier to understand. But the thousands of letters to Marilyn don’t bear this out: most of the letter-writers intuited those constraints but *still* got it wrong. Marilyn herself suggested that the problem would be more intuitive if you cranked the numbers up: make it a million doors, and have Monty open 999,998 of them. Do that, and the magical power of big numbers makes your brain change the question it’s asking from “Which door is the car behind?” to “Did I choose right the first time?” But it turns out that when you increase the numbers, people *are* more likely to switch, *but* they still don’t believe the probability changes.
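
The cranked-up version is easy to sketch in code too. This is a rough simulation, assuming the standard rules generalized to any number of doors: Monty opens every door except yours and one other, and never reveals the car. The names `switch_wins` and `switch_rate` are mine.

```python
import random

def switch_wins(n_doors, rng):
    """One round with n_doors doors; True if switching gets the car."""
    car = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    if pick == car:
        # Monty leaves one random goat door closed alongside the player's.
        other = rng.randrange(n_doors - 1)
        if other >= pick:
            other += 1  # skip over the player's own door
    else:
        # Monty can't open the car's door, so it is the one left closed.
        other = car
    return other == car

def switch_rate(n_doors, trials=20_000, seed=0):
    """Fraction of rounds in which switching wins."""
    rng = random.Random(seed)
    return sum(switch_wins(n_doors, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for n in (3, 10, 1_000_000):
        print(n, "doors:", switch_rate(n))
```

Notice that the code never has to work out which door the car is behind. The only thing that decides the outcome is whether the first pick was right, which is the same change of question the big numbers force on your brain.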

I’ll spare you the formal proof of the answer for the moment. The reason I’m writing this is that I wanted to share with you how *I* came to believe the answer. The *math* of the answer, I got right away. The problem was, it didn’t *feel* right. It felt like a glitch in the Matrix, or like Scotty’s solution to the Kobayashi Maru: a weird math thing that didn’t map to reality, like imaginary numbers. Or real numbers, for that matter.

I had a hard time explaining this to people. And by “people” here, I’m taking some liberties, because I mostly mean “mathematicians”. They couldn’t read it any way other than, “I don’t understand the math.” But I understood the math. What I couldn’t understand was reality.

But that million-door variant of the problem reminds me of another probability question I used to have a hard time with. The flipped coin question. Imagine you have a fair coin, and you flip it nine-hundred-ninety-nine thousand, nine-hundred-ninety-nine times, and it comes up heads each time. What are the odds that it’ll come up heads on the next flip?

I find it *completely* intuitive that the answer is ½. And when you phrase it the way I did, I *hope* most people agree. But there’s a whole family of similar questions where the math is exactly the same, but people will fall prey to a gambler’s fallacy and become convinced that the coin is *due*. It’s all about the law of averages: a tossed coin coming up heads a million times in a row is *astronomically* unlikely. One more heads flip, and it will happen. Therefore, the odds of getting that last heads flip must *also* be astronomical.

That’s ridiculous, of course. But the usual explanation for *why* it’s ridiculous never felt quite satisfying to me: “Because how could it not be 50:50? It’s not like those past coin flips have some magical power to change the probability of the next one.” It’s *true*, but it doesn’t address the “law of averages” thing that makes it feel wrong.

The answer that satisfied me is this: your belief that getting a million heads in a row is astronomically unlikely is *arbitrarily privileging* the number “a million”. Because it’s **big and round**. Because you know what *else* is astronomically unlikely? Flipping heads nine-hundred-ninety-nine thousand, nine-hundred-ninety-nine times in a row. And you’ve already done that. The punchline is this: how much *more* astronomical are the odds of a million heads compared to the odds of nine-hundred-ninety-nine thousand, nine-hundred-ninety-nine? Exactly twice. The way I look at it is this way: you’ve *already used up* most of the “unlikely” getting the first nine-hundred-ninety-nine thousand, nine-hundred-ninety-nine heads. And the “amount” of “unlikely” you’ve got left to get to a million is *exactly fifty-fifty*.

Credible mathematicians would say I’m being obtuse. It’s fifty-fifty because each coin toss is an independent event with independent probability and that’s the end of it. But there’s the rub, right? *Why* is each coin toss “independent”? *Mathematically*, each coin toss is independent because the probability of flipping heads on flip N+1 is the same regardless of the outcome of flip N. But this is just begging the question (And for possibly the only time in my life, I think I am using the phrase “begging the question” in its pedantically correct sense): the odds are the same because the flips are independent, and the flips are independent because the odds are the same. Remove the abstraction of mathematics, and you get at a more reasonable explanation: the flips are independent because a coin, being a slug of metal, neither remembers nor cares what the outcome of the last flip was.

What always made me uncomfortable, then, was the *context switch*: the simple answer can be expressed as elegant, pure, abstract mathematics, right up until you get to the bit where we simply have to take for granted that coin tosses are independent events. It just feels *wrong* for math to throw up its hands at that point and say, “*Because it’s a **coin**, dumbass!*” Especially when you frame it in a context like, “We’re one flip away from a million heads in a row,” because “A million heads in a row” sure doesn’t *sound* like an independent probability. On account of it isn’t: P(one million) = 2^{-1,000,000}; P(one million | 999,999) = 2^{-1}.
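
Nobody is going to enumerate a million flips, but the same two numbers can be checked by brute force on a small run. This is a sketch using ten flips as a stand-in for a million; the function names are mine, and the counting assumes a fair coin so that every sequence is equally likely.

```python
from fractions import Fraction
from itertools import product

def p_all_heads(n):
    """Exact probability of n heads in n fair flips."""
    return Fraction(1, 2 ** n)

def p_all_heads_given_prefix(n):
    """P(n heads | the first n-1 flips were heads), by counting sequences."""
    sequences = list(product("HT", repeat=n))
    # Keep only the sequences consistent with what has already happened.
    prefix_ok = [s for s in sequences if all(flip == "H" for flip in s[:-1])]
    all_heads = [s for s in prefix_ok if s[-1] == "H"]
    return Fraction(len(all_heads), len(prefix_ok))

if __name__ == "__main__":
    n = 10
    print(p_all_heads(n))                       # 1/1024
    print(p_all_heads(n - 1) / p_all_heads(n))  # 2
    print(p_all_heads_given_prefix(n))          # 1/2
```

Out of 1,024 possible sequences, only two start with nine heads, and one of those two ends in a tenth.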

And indeed, once I made the mental switch to the notion of having “already used up” the improbability, the whole thing made a kind of actual *sense* to me, rather than just being a series of mathematical formulae that I knew on account of I’d read them in a book and remembered them. That coin isn’t *due*: it whittled away at the improbability slowly, over the course of the previous almost-a-million flips. Put another way, if you do the coin-flip experiment in 2^{1,000,000} parallel universes, there’s only one where you get heads a million times, and many where you don’t. But if you know that you got the first 999,999, then *you aren’t in most of those universes*: you’re in one of *two* (Or maybe both of two. Quantum mechanics is weird).

You can apply that paradigm to other things as well. Larry Niven once famously declared that an ear of corn was more likely to father a child with Lois Lane than Superman was, meaning that out of all possible universes, a Kryptonian would be sufficiently genetically similar to a human for cross-breeding to be possible in an infinitesimal percentage of them. But this fails to consider that it’s *already* an infinitesimal percentage of said universes where Kryptonians would be sufficiently similar to humans that Lois would even be willing to *try* (You may, as the mood strikes you, include or exclude those universes where Lois is just Into That Sort of Thing). It’s not “tiny number out of nigh-infinite number”, it’s “tiny number out of very-slightly-less-tiny number”.

Let me get back to Monty Hall. I slipped the key observation to the problem in a few paragraphs back without calling attention to it. Did you catch it? People get the Monty Hall problem wrong because they don’t approach the question that was asked. Most people hear the problem and approach it as “Which door is the car behind?” But that’s not the question that was asked: the question was “Should you switch?”, which is to say, “Did you pick the right door on the first try?” The answer to *that* question is very obviously, “Probably not.”

Where I always got hung up was the idea that the solution lay in what you knew when you made your initial choice, versus what you know afterward. People kept trying to explain it to me that way, and it just sent me astray, thinking that the glitch in the Matrix lay somewhere in the vicinity of, “Had I known it *wasn’t* door number 3 when I chose door number 1, I’d have chosen door number 2 instead.” But that’s nonsense.

Here’s the magic thing, and why I don’t think that restating the standard assumptions about Monty’s behavior actually helps people understand the paradox. *What Monty does has **no impact** on the input to the question.* Monty *isn’t* giving you actual new information you can use to make a better guess, as people think. If he were, then it would reduce to just making a new guess, this time out of two rather than out of three, and, yeah, the odds would be even. But what Monty is actually doing by opening a door is nothing more than *distracting you* from the actual offer he’s made: switch, or not-switch.

Here is when I finally “got it”, when I finally moved beyond understanding the math to actually seeing how the physical reality of the problem worked, and stopped thinking it was some kind of weird Schrödinger’s goat thing. Let’s just remove Monty from the equation altogether:

You are a contestant on $FAMOUS_GAME_SHOW. The host, $WELL_KNOWN_TELEVISION_PERSONALITY, presents you with a choice of three doors. Behind one of these doors is a BRAND NEW CAR(!), while behind the other two are goats. You are asked to select one of the three doors. After you have made your selection, the host offers you a choice: open the door you initially selected and claim the prize behind it, or bet against yourself. Your door will remain closed, and instead, you’ll get what’s behind **both** other doors. What course of action should you choose?

I submit that this variation is mathematically identical to the standard problem (plus or minus one goat). All Monty’s doing when he opens one door is finding a misleading way to explain that your choices are between “Get the car if you picked right initially” and “Get the car if you picked wrong initially”. That’s all you’re being asked. And there’s no variation of the assumptions where what you’re being asked is to “pick again with new information”: you’re only ever being asked whether you think your first guess was right or wrong. In some of the variations, Monty’s behavior might be encoding a hint about the answer to that question, but it’s never changing the question. For all mathematicians like to say that the problem is people not understanding probability, it’s actually a problem of people not understanding *the question*.
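
The "mathematically identical" claim is small enough to check exhaustively. Here is a sketch that walks every combination of car position, first pick, and door Monty could legally open, and compares switching in the standard game against taking both other doors in the Monty-free one. The helper names are hypothetical, and the tally assumes the car and the first pick are each uniformly random.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)

def standard_switch_wins(car, pick, opened):
    """Standard game: switch to the door that is neither picked nor opened."""
    final = next(d for d in DOORS if d not in (pick, opened))
    return final == car

def both_other_doors_wins(car, pick):
    """Monty-free game: you get whatever is behind both unpicked doors."""
    return car != pick

def compare():
    """Return (do the two games always agree?, chance of winning the car)."""
    agree = True
    wins = 0
    for car, pick in product(DOORS, DOORS):
        for opened in DOORS:
            if opened in (car, pick):
                continue  # Monty never opens the car's door or the player's
            same = standard_switch_wins(car, pick, opened) == both_other_doors_wins(car, pick)
            agree = agree and same
        wins += both_other_doors_wins(car, pick)
    return agree, Fraction(wins, len(DOORS) ** 2)

if __name__ == "__main__":
    print(compare())
```

The two games agree in every single case, whichever door Monty opens, and the car comes home in six of the nine equally likely starting positions.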

Remove the obfuscating detail of Monty opening one door and offering to let you have the other, and suddenly, it all becomes clear: the second choice is not to *pick* a door, but to *reject* one. That’s why switching wins two-thirds of the time.

And hey, worst case, you get *two* goats.