Humanity Lost on Jeopardy!

Wednesday, February 16, 2011
By dreeves

I can't allow you to endanger the mission, Dave.

BREAKING: IBM’s question answering computer, Watson, defeats Jeopardy champions Ken Jennings and Brad Rutter!

Not to detract from IBM’s achievement, but I’m disappointed by the buzzer aspect of this. Background: Watson gets the question (or “clue” or “answer” as they call it on Jeopardy) as plain text at the same moment that the humans see it on the screen. But no one can buzz in until the host finishes reading the question. At the moment he finishes, a light comes on indicating that the buzzers are activated. If you buzz in before that, you’re locked out for a fraction of a second.

Here’s the problem: Watson has superhuman reaction time. (It buzzes in with a mechanical thumb, which introduces a tiny delay, but one shorter than the fastest possible human reaction time.) That’s not necessarily an unfair advantage, because the humans have a related unfair advantage of their own: since only they can hear Trebek, they can anticipate the moment the buzzers are activated and buzz in faster than even Watson can react. By accepting a risk of getting locked out, they can, with seemingly small probability, beat Watson to the buzzer. [1] To be clear, under the current rules Watson has an unfair advantage in reaction time (unfair in terms of what the competition is really about), and Ken and Brad have an unfair advantage in hearing Trebek speak, which lets them time their button presses to beat Watson’s reaction time (though rarely with success, it turns out). It would be quite a coincidence if those advantages perfectly canceled out and yielded a fair game. [2]

What’s really needed is a rule tweak: everyone who buzzes in within the first fraction of a second after the buzzers are activated is considered to have buzzed in simultaneously, and the tie is broken randomly. Or, simpler, just eliminate the lock-out: buzzing in before the buzzers are activated is treated as buzzing in at the exact moment they are activated.
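For the programmers in the audience, here is a rough sketch of how those two rule tweaks might be scored. Everything specific in it is my own assumption for illustration: the function names, the 0.25-second window (the post only says "a fraction of a second"), the choice to simply ignore early buzzes under the first rule, and the random tie-break among several early buzzers under the second. Times are in seconds relative to the moment the buzzers are activated, so a negative number means buzzing early.

```python
import random


def pick_winner_tie_window(buzz_times, window=0.25, rng=random):
    """Rule tweak 1: buzzes inside the window count as simultaneous.

    buzz_times maps contestant name -> seconds after activation.
    Early (negative) buzzes are ignored here; that is a simplification,
    not something the rule itself specifies.
    """
    # Everyone who buzzed within the window is treated as tied.
    tied = sorted(name for name, t in buzz_times.items() if 0 <= t <= window)
    if tied:
        return rng.choice(tied)  # tie broken uniformly at random
    # Nobody made the window: the earliest later buzz wins as usual.
    late = {name: t for name, t in buzz_times.items() if t > window}
    if not late:
        return None
    return min(late, key=late.get)


def pick_winner_no_lockout(buzz_times, rng=random):
    """Rule tweak 2: no lock-out; an early buzz counts as time zero."""
    if not buzz_times:
        return None
    # Clamp early buzzes to the activation moment.
    clamped = {name: max(t, 0.0) for name, t in buzz_times.items()}
    earliest = min(clamped.values())
    # Assumption: several contestants sharing the earliest time are
    # separated by a fair coin flip.
    tied = sorted(name for name, t in clamped.items() if t == earliest)
    return rng.choice(tied)


if __name__ == "__main__":
    buzzes = {"Watson": 0.01, "Ken": 0.20, "Brad": 0.90}
    print(pick_winner_tie_window(buzzes))   # Watson or Ken, at random
    print(pick_winner_no_lockout({"Ken": -0.05, "Watson": 0.01}))  # Ken
```

Under the first rule Watson’s reflexes stop mattering as long as a human lands inside the window; under the second, anticipating Trebek carries no risk, so a human who anticipates at all will at worst tie with other early buzzers.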

But enough of the buzzer nitty-gritty. There’s a more important asymmetry: humans are vastly better at actually understanding what is being asked, and vastly worse at knowing the answer. Watson almost always has the answer in its database, so when it is lucky enough to parse the question right, it should get it. For the humans it’s the opposite. They always parse the question right but don’t always know the answer.

Put Watson and a human (any human) together and you really have a quantum leap forward in question answering. The human, for example, would nix Watson’s pathetic “Toronto” as an answer to a “What U.S. city…” question or “Picasso” for “What art period…”.

My verdict: This is super impressive and will be super useful. It might not even be too hyperbolic to anticipate this saving lives or otherwise making the world awesomer. Still, in terms of true natural language understanding — having a normal conversation with a computer — this seems to be pretty minuscule progress.

But that’s actually good news for me. Three years ago I made a bet with Anna Salamon about whether computers will pass the Turing test by 2018. I will lose $10,000 if they do and win $100 if they don’t. (For terms we piggybacked on the inaugural longbets.org bet.) After watching Watson’s performance (admittedly on the edge of my seat) I’m as sure as ever that my money is safe.

Thanks to Anna Salamon and Lev Reyzin for the discussion that led to this post.

Illustration by Kelly Savage.

Footnotes

[1] Conceivably it never actually happened that the humans beat Watson to the buzzer and they only ever beat Watson when it hadn’t quite found its answer in time. If so, the game was really skewed in Watson’s favor. But probably it did happen that the humans beat it to the buzzer occasionally. Perhaps IBM will set the record straight on this.

[2] Speaking of quasi-unfair advantages, there are two other factors that sullied the competition:

  1. Watson doesn’t get to hear the other contestants’ answers. Though it’s not clear how unfair that is. Even if it had ears, i.e., a microphone, for listening to the other answers, it’s not clear it could make heads or tails of what it was hearing and avoid its silly blunder of repeating a wrong answer.
  2. It’s a bit lame that Watson knows the historical probabilities of where the Daily Doubles are, updated on the state of the board, even. In other words, it’s doing fancy machine learning on something completely unrelated to question answering. That really calls for a (trivial) change in the game: just use fair randomization for Daily Double placement!
