Humanity Lost on Jeopardy!

Wednesday, February 16, 2011
By dreeves

I can't allow you to endanger the mission, Dave.

BREAKING: IBM’s question-answering computer, Watson, defeats Jeopardy champions Ken Jennings and Brad Rutter!

Not to detract from IBM’s achievement but I’m disappointed by the buzzer aspect of this. Background: Watson gets the question (or “clue” or “answer” as they call it on Jeopardy) as plain text at the same moment that the humans see it on the screen. But no one can buzz in until the host finishes reading the question. At the moment he finishes, a light comes on indicating that the buzzers are activated. If you buzz in before that then you’re locked out for a fraction of a second.

Watson has an unfair advantage in reaction time and Ken and Brad have an unfair advantage in timing their button presses.

Here’s the problem: Watson has superhuman reaction time. (It buzzes in with a mechanical thumb and this introduces a tiny delay but less than the fastest possible human reaction time.) It’s not necessarily an unfair advantage because the humans have their own related unfair advantage, namely, they can anticipate the moment the buzzers are activated (since only they can hear Trebek) and can buzz in faster than even Watson’s reaction time. By accepting a risk of getting locked out, they can, with seemingly small probability, beat Watson to the buzzer. [1] To be clear, with the current rules, Watson has an unfair (in terms of what the competition is really about) advantage in reaction time and Ken and Brad have an unfair advantage in hearing Trebek speak, and thus being able to time their button press to beat Watson’s reaction time (rather rarely with success, it turns out). It would be quite a coincidence if those advantages perfectly canceled out and yielded a fair game. [2]

What’s really needed is a rule tweak: Everyone who buzzes in within the first fraction of a second after the buzzers are activated is considered to have buzzed in simultaneously and the tie is broken randomly. Or, simpler, just eliminate the lock-out — buzzing in before the buzzers are activated is just treated as buzzing in at the exact moment the buzzers are activated.
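Here is a minimal Python sketch of the two rule variants just described, not the show's actual buzzer logic. The function names, the 0.1-second window (the post only says "a fraction of a second"), and the convention that press times are seconds relative to buzzer activation (negative meaning early) are all assumptions made for illustration.

```python
import random

def resolve_with_tie_window(press_times, window=0.1, rng=random):
    """Variant 1: presses within `window` seconds of activation count as a tie."""
    # Presses before activation (t < 0) are ignored; the lock-out still applies.
    valid = {name: t for name, t in press_times.items() if t >= 0}
    if not valid:
        return None
    tied = sorted(name for name, t in valid.items() if t <= window)
    if tied:
        return rng.choice(tied)  # random tie-break among "simultaneous" presses
    return min(valid, key=valid.get)  # nobody in the window: earliest press wins

def resolve_without_lockout(press_times, rng=random):
    """Variant 2: an early press is treated as a press at activation (t = 0)."""
    if not press_times:
        return None
    clamped = {name: max(0.0, t) for name, t in press_times.items()}
    earliest = min(clamped.values())
    tied = sorted(name for name, t in clamped.items() if t == earliest)
    return rng.choice(tied)  # several early presses all land at t = 0

# Hypothetical press times: Watson reacts in 10 ms, Ken anticipates by 50 ms.
presses = {"Watson": 0.01, "Ken": -0.05, "Brad": 0.20}
print(resolve_with_tie_window(presses))
print(resolve_without_lockout(presses))
```

Under the first variant Ken's early press is simply discarded, so anticipation still carries a risk. Under the second, anticipating costs nothing and Ken wins whenever he presses at or before activation.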

Humans are vastly better at understanding what is being asked and vastly worse at knowing the answer.

But enough of the buzzer nitty-gritty. There’s a more important asymmetry: Humans are vastly better at actually understanding what is being asked, and vastly worse at knowing the answer when Watson is lucky enough to understand the question. In other words, Watson almost always has the answer in its database so if it parses the question right, it should get it. For the humans it’s the opposite. They always parse the question right but don’t always know the answer.

Put Watson and a human (any human) together and you really have a quantum leap forward in question answering. The human, for example, would nix Watson’s pathetic “Toronto” as an answer to a “What U.S. city…” question or “Picasso” for “What art period…”.

My verdict: This is super impressive and will be super useful. It might not even be too hyperbolic to anticipate this saving lives or otherwise making the world awesomer. Still, in terms of true natural language understanding — having a normal conversation with a computer — this seems to be pretty minuscule progress.

But that’s actually good news for me. Three years ago I made a bet with Anna Salamon about whether computers will pass the Turing test by 2018. I will lose $10,000 if they do and win $100 if they don’t. (For terms we piggybacked on the inaugural longbets.org bet.) After watching Watson’s performance (admittedly on the edge of my seat) I’m as sure as ever that my money is safe.
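To put a number on those odds, here is a small sketch of the break-even probability the stakes imply. It assumes a risk-neutral bettor and ignores the time value of money; the function name is made up for illustration.

```python
from fractions import Fraction

def break_even_probability(loss_if_wrong, gain_if_right):
    """Probability of losing at which the bet has zero expected value."""
    # Expected value: gain * (1 - p) - loss * p = 0  =>  p = gain / (gain + loss)
    return Fraction(gain_if_right, gain_if_right + loss_if_wrong)

p = break_even_probability(loss_if_wrong=10_000, gain_if_right=100)
print(p)         # 1/101
print(float(p))  # a little under 1%
```

In other words, taking this side of the bet only makes sense if you put the chance of a computer passing the Turing test by 2018 below roughly 1 in 100.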

Thanks to Anna Salamon and Lev Reyzin for the discussion that led to this post.

Illustration by Kelly Savage.

Footnotes

[1] Conceivably it never actually happened that the humans beat Watson to the buzzer and they only ever beat Watson when it hadn’t quite found its answer in time. If so, the game was really skewed in Watson’s favor. But probably it did happen that the humans beat it to the buzzer occasionally. Perhaps IBM will set the record straight on this.

[2] Speaking of quasi-unfair advantages, there are two other factors that sullied the competition:

  1. Watson doesn’t get to hear the other contestants’ answers. Though it’s not clear how unfair that is. Even if it had ears, i.e., a microphone, for listening to the other answers, it’s not clear it could make heads or tails of what it was hearing and avoid its silly blunder of repeating a wrong answer.
  2. It’s a bit lame that Watson knows the historical probabilities of where the Daily Doubles are, updated on the state of the board, even. In other words, it’s doing fancy machine learning on something completely unrelated to question answering. That really calls for a (trivial) change in the game: just use fair randomization for Daily Double placement!
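As a sketch of how trivial that second change would be: the snippet below picks Daily Double cells uniformly at random. The 6-by-5 board size and the function name are assumptions for illustration, and it does not try to reproduce any placement constraints the real show may use.

```python
import random

def place_daily_doubles(count, categories=6, rows=5, rng=random):
    """Choose `count` distinct board cells, each cell equally likely."""
    cells = [(c, r) for c in range(categories) for r in range(rows)]
    return rng.sample(cells, count)

# Each result is a list of (category index, row index) pairs.
print(place_daily_doubles(2))
```

With uniform placement, historical location statistics carry no information, so neither Watson nor a human champion gains anything by hunting.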

  • Bryce

    Re: footnote #2, Brad and Ken know that probability too. It was a huge part of their advantage over lesser Jeopardy competitors when they played against humans. The hunting-for-daily-doubles board-clearance pattern that we saw during the IBM challenge wasn’t really indicative of a computer being in the game so much as it was indicative of a very high-level game of Jeopardy.

  • http://dreev.es dreeves

    @Bryce, good point, but we expect Watson to have the edge in that aspect of the game, right? And I think even our ML friends will have to agree it’s an uninteresting aspect of the game. Anyway, it’s minor compared to the buzzer aspect (hence only a footnote) but it detracts slightly from the coolness of Watson’s victory.

    Btw, I lost yet another $20 for a typo in this post. Unclosed parenthesis. What I get for not composing in Emacs.

  • http://www.levreyzin.com Lev Reyzin

    I’ve become convinced you’re right about the buzzer. As Ken Jennings notes (http://is.gd/kpuyKc) and Scott Aaronson writes (http://www.scottaaronson.com/blog/?p=550), this game was a repeated demonstration that computers are faster than humans, but not that they are better at trivia. While I don’t know if the advantage was “unfair,” it certainly made the contest less illustrative.

  • http://www.twitter.com/theblackgecko Cody L. Custis

    I must concur with Jennings’s conclusion that Watson had an unfair advantage with buzzers. However, that can be easily fixed for a rematch.

    The real challenge isn’t when an upgraded version of Watson (one that has to process speech) wins a Jeopardy round with a fair buzzer system. It’s not even when the upgraded Watson is able to parse away Alex Trebek’s banter to get to questions, and learn from humans when the answers are wrong. The challenge is when the upgraded version of Watson goes on David Letterman the next day to talk about the victory.

  • http://www.consultingstatisitcs.org Basil

    I’m so glad you posted on this! I watch Jeopardy somewhat regularly and was slightly upset at the computer victory. I was most curious as to how they were making the buzz times fair, but now I know they didn’t. The other aspect that was totally unfair was the input of the question. I believe the only fair way to make the computer read the question is visually and verbally like the audience and other participants. The computer could easily be programmed to do this and cross-validate its visual predictions with its verbal predictions of the question.

  • http://mikekr.blogspot.com zbicyclist

    The real losers were the people who work in call centers and similar jobs.

    It’s easy to see that Watson could understand my issues substantially better than the average “customer service” rep, and be much more consistent in applying corporate policy.

  • http://blog.oddhead.com Dave

    I agree the buzzer was unfair. However, Watson still had to be good. Answering incorrectly on Jeopardy means losing points. So Watson had to be accurate and, even more difficult, know when it is accurate.