Modular Approaches to Crossword Puzzles and Other Language Games

Presentation Transcript

  1. Modular Approaches to Crossword Puzzles and Other Language Games
     Michael L. Littman, Rutgers University, mlittman@cs.rutgers.edu
     Crosswords and Constraint Satisfaction

  2. Motivation
     Software that can understand language.
     • Question answering
     • Constructing databases from text
     • Natural-language interaction
     • Automatic summarization/briefing
     How can programs represent meaning?

  3. Traditional Approach
     ~like(me, bananas) ^ respect(me, x situation(x) ^ humorous(bananas,x))
     I don’t really like bananas, but have long respected their humorous potential.
     Semantics tricky; word meanings informal.
     Behavioral approach: do something with it.

  4. Language Games
     Like other games:
     • Evaluation process clean.
     • Challenging (and fun!) for people.
     Unlike logical games:
     • Meaning matters!
     • No closed-world assumption; messy.
     • Learning necessary... moving target.
     Machine performance far from humans’.

  5. Word Games
     Super-human performance common:
     • Scrabble™: Maven, near-perfect (Sheppard 02)
     • Boggle™: millisecond solutions (Boyan 98)
     • Hangman (Littman 00)
       • 99.97% of 9-letter words under 5 guesses
       • 1.35 misses on average

  6. Trivial Pursuit™
     Race around board, answer questions.
     Categories: Geography, Entertainment, History, Literature, Science, Sports

  7. Wigwam
     QA via AQUA (Abney et al. 00)
     • back off: word match in order helps score.
     • “When was Amelia Earhart's last flight?”
     • 1937, 1897 (birth), 1997 (reenactment)
     • Named entities only, 100G of web pages
     Move selection via MDP (Littman 00)
     • Estimate category accuracy.
     • Minimize expected turns to finish.

  8. Modular Approach to QA
     High-performance question answering system uses a variety of approaches:
     • huge corpus of text on many topics
     • database of questions and answers
     • tables of facts
     • combines multiple extraction methods
     Meaning has many faces.

  9. Wigwam’s Knowledge
                          wigwam   me    trivia   web
     arts & literature    .3       .6    .6       .9
     entertainment        .3       .3    .5       .9
     science & nature     .2       .7    .7       .7
     geography            .1       .2    .4       .9
     history              .1       .2    .5       .9
     sports & leisure     .025     .6    .7       .4
     ~turns/game          414      48    22       8

  10. Who Wants to Be a Millionaire
      “You know, we ought to enter her in one of those TV quiz shows. We could make a fortune.” (Danny Dunn in Williams & Abrashkin 58)
      Multiple-choice questions, increasing difficulty:
      • 100, 200, 300, 500, 1000
      • 2000, 4000, 8000, 16000, 32000
      • 64000, 125000, 250000, 500000, 1000000

  11. Question Answering Approach
      Choose highest ranked choice.
      • 75%, 68%, 56% (Clarke, Cormack & Lynam 01)
      Expected value (always go on):
      • $3,689
      • Most value due to (rare) $1M.
      People:
      • $97,357
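The "always go on" expected value can be illustrated with a short calculation. This is only a sketch: the prize ladder comes from the previous slide, but the guaranteed levels ($1,000 and $32,000) and the flat per-question accuracy are assumptions made here for illustration, and the slide's $3,689 figure is not reproduced because the per-question accuracies behind it are not given in the transcript.

```python
# Prize ladder from the slide, in dollars.
PRIZES = [100, 200, 300, 500, 1000,
          2000, 4000, 8000, 16000, 32000,
          64000, 125000, 250000, 500000, 1000000]

# Assumption: amounts a contestant keeps after a later wrong answer.
SAFE_LEVELS = {1000, 32000}


def expected_winnings(accuracies):
    """Expected payout when the player answers every question (never walks away)."""
    expected = 0.0
    reach = 1.0   # probability that all earlier questions were answered correctly
    floor = 0     # payout if the current question is missed
    for prize, accuracy in zip(PRIZES, accuracies):
        # Missing this question pays the last guaranteed level reached so far.
        expected += reach * (1.0 - accuracy) * floor
        reach *= accuracy
        if prize in SAFE_LEVELS:
            floor = prize
    # Answering all fifteen questions pays the top prize.
    return expected + reach * PRIZES[-1]


# Hypothetical flat accuracy of 50% per question (not a figure from the slides).
flat_half = expected_winnings([0.5] * 15)
print(flat_half)
```

Even in this toy setting a noticeable share of the total comes from the final term, the small chance of answering all fifteen questions, which matches the slide's remark that most of the value is due to the rare $1M.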

  12. The Humble Crossword

  13. NYT, Saturday, October 10th, 1998

  14. Variety of Clue Types
      Thesaurus: Cut off _ _ _ _ _ _ _ _ (ISOLATED)
      Puns & Wordplay: Monk’s head? _ _ _ _ _ (ABBOT)
      Arts & Literature: “Foundation Trilogy” author _ _ _ _ _ _ (ASIMOV)
      Popular Culture: Pal of Pooh _ _ _ _ _ _ (TIGGER)
      Encyclopedic: Mountain known locally as Chomolungma _ _ _ _ _ _ _ (EVEREST)
      Crosswordese: Kind of coal or coat _ _ _ (PEA)

  15. PROVERB: System Design
      Candidate generation (Keim et al. 99)
      • Like information retrieval: clue implies target
      • Variety of approaches used simultaneously
      Merging
      • Like meta search engine: create master list
      Grid filling (Shazeer et al. 99)
      • Like constraint satisfaction: fit answer to grid
      Probabilities are the common language.

  16. PROVERB System Architecture

  17. Modules: ClueDB
      Nymph pursuer: SATYR
      Bugs pursuer: ELMER
      Nymph chaser: SATYR
      Place for an ace: SLEEVE
      Highball ingredient: RYE
      Highball ingredient: ICE
      exact: Highball ingredient: RYE
      partial: Ace place?: SLEEVE
      TransModule: Bugs chaser: ELMER (X chaser, X pursuer)
      Also Dijkstra[1-2], d[1-2]c, lsicwdb
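A minimal sketch of how an exact-match lookup and one clue transformation might work on the entries shown on this slide. The function names, the frequency-based scoring, and the single hard-coded "X chaser" to "X pursuer" rule are all hypothetical; the transcript does not say how the actual exact module or TransModule were implemented.

```python
from collections import Counter, defaultdict

# Clue/answer pairs taken from the slide.
CLUE_DB = [
    ("Nymph pursuer", "SATYR"),
    ("Bugs pursuer", "ELMER"),
    ("Nymph chaser", "SATYR"),
    ("Place for an ace", "SLEEVE"),
    ("Highball ingredient", "RYE"),
    ("Highball ingredient", "ICE"),
]


def build_index(pairs):
    """Map each lower-cased clue to a count of the answers seen with it."""
    index = defaultdict(Counter)
    for clue, answer in pairs:
        index[clue.lower()][answer] += 1
    return index


def exact_candidates(index, clue, length):
    """Answers previously seen for this exact clue, scored by relative frequency."""
    counts = index.get(clue.lower(), Counter())
    fitting = {a: c for a, c in counts.items() if len(a) == length}
    total = sum(fitting.values())
    return {a: c / total for a, c in fitting.items()}


def trans_candidates(index, clue, length):
    """Hypothetical transformation: rewrite 'X chaser' as 'X pursuer', then look up."""
    suffix = " chaser"
    if clue.lower().endswith(suffix):
        rewritten = clue[: -len(suffix)] + " pursuer"
        return exact_candidates(index, rewritten, length)
    return {}


index = build_index(CLUE_DB)
print(exact_candidates(index, "Highball ingredient", 3))
print(trans_candidates(index, "Bugs chaser", 5))
```

Returning a probability per candidate, rather than a bare list, keeps the output in the form the system design slide calls the common language between modules.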

  18. ClueDB Comparison
      Available at .com
                    exact    TransModule   partial
      Coverage      40.3%    73.0%         92.6%
      Accuracy      91.4%    79.8%         71.0%
      # Returned    1.3      1.5           493.0

  19. Modules: Other DBs
      Database modules: Transform clue to DB query.
        imdb: Warner of Hollywood: OLAND
        wordnet: Fruitless: ARID
      Syntactic: Variations of fill-in-the-blanks.
        blanks_movies: “Heavens ____!”: ABOVE
        also blanks_{books, geo, movies, music, quotes}, kindof
      Web search: Not used in experimental system.
        google: “The Way To Natural Beauty” author, 1980: TIEGS
      Also rogetsyns, geo, writers, compass, myth, altavista, yahoo, infoseek, EbModule, lsiency, etc.

  20. Modules: Backstops
      Word lists: Ignore clue, return all words.
        wordList: 10,000 words, perhaps: NOVELETTE
      Implicit modules: Probability distributions over all strings of words (e.g., bigram).
        segmenter: 1934 Hall and Nordhoff adventure novel: PITCAIRNISLAND
      Also bigWordList, wordList, DbList

  21. Merging Candidate Lists
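The content of this slide did not survive in the transcript, so the merging rule itself is unknown here. As an illustration only, the sketch below assumes the simplest probabilistic combination, a weighted mixture of the per-module candidate distributions; the module names, weights, and candidate probabilities are invented and should not be read as PROVERB's actual merger.

```python
def merge(module_outputs, weights):
    """Weighted mixture of candidate distributions, one distribution per module.

    module_outputs: {module name: {candidate: probability}}
    weights:        {module name: non-negative trust weight} (hypothetical values)
    """
    total_weight = sum(weights[name] for name in module_outputs)
    merged = {}
    for name, distribution in module_outputs.items():
        share = weights[name] / total_weight
        for candidate, probability in distribution.items():
            merged[candidate] = merged.get(candidate, 0.0) + share * probability
    return merged


# Invented example: a precise module and a broad backstop for a 3-letter slot.
outputs = {
    "exact": {"RYE": 0.5, "ICE": 0.5},
    "wordList": {"RYE": 0.25, "ICE": 0.25, "OAK": 0.5},
}
trust = {"exact": 3.0, "wordList": 1.0}
master_list = merge(outputs, trust)
print(sorted(master_list.items(), key=lambda item: -item[1]))
```

Because each input is a probability distribution and the shares sum to one, the merged master list is again a distribution, which is what the grid-filling stage needs.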

  22. Module Performance

  23. CSPs
      Constraint satisfaction is a core CS task.
      Apps:
      • planning and scheduling
      • design
      • vision
      • natural language understanding
      • temporal reasoning
      • protocol verification
      Crossword puzzles the poster child.

  24. Grid Filling and CSPs

  25. CSPs and IR
      Domain from ranked candidate list?
      Tortellini topping: TRATORIA, COUSCOUS, SEMOLINA, PARMESAN, RIGATONI, PLATEFUL, FORDLTDS, SCOTTIES, ASPIRINS, MACARONI, FROSTING, RYEBREAD, STREUSEL, LASAGNAS, GRIFTERS, BAKERIES, … MARINARA, REDMEATS, VESUVIUS, …
      Standard recall/precision tradeoff.

  26. Probabilities to the Rescue?
      Annotate domain with the bias.

  27. Solution Probability
      Proportional to the product of the probabilities of the individual choices.
      Can pick sol’n with maximum probability.
      Maximizes prob. of whole puzzle correct.
      Won’t maximize number of words correct.
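A small worked example of the product rule, as a sketch. The 2x2 grid, the candidate words, and their probabilities are all invented for illustration, and brute-force enumeration is used only because the toy puzzle is tiny.

```python
from itertools import product

# Hypothetical 2x2 grid: A1 and A2 are the two rows, D1 and D2 the two columns.
SLOTS = ["A1", "A2", "D1", "D2"]
CANDIDATES = {
    "A1": {"TO": 0.5, "HA": 0.5},
    "A2": {"ON": 0.36, "IT": 0.34, "IS": 0.30},
    "D1": {"TO": 0.5, "HI": 0.5},
    "D2": {"ON": 0.36, "AT": 0.34, "AS": 0.30},
}


def consistent(a1, a2, d1, d2):
    """Each column word must spell the letters the two row words put in that column."""
    return d1 == a1[0] + a2[0] and d2 == a1[1] + a2[1]


def solution_probabilities(candidates):
    """Probability of each consistent fill, proportional to the product of its choices."""
    weights = {}
    for words in product(*(candidates[slot] for slot in SLOTS)):
        if consistent(*words):
            weight = 1.0
            for slot, word in zip(SLOTS, words):
                weight *= candidates[slot][word]
            weights[words] = weight
    total = sum(weights.values())
    return {solution: weight / total for solution, weight in weights.items()}


probs = solution_probabilities(CANDIDATES)
best = max(probs, key=probs.get)
for solution, p in probs.items():
    print(solution, round(p, 4))
print("maximum-probability solution:", best)
```

Only three of the 36 word combinations fit the grid, and normalizing over those three is what turns the per-slot products into solution probabilities.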

  28. Posterior Score
      Posterior probability of a candidate in a slot is the sum of the probabilities of the solutions that place it there.

  29. Maximum Expected Overlap
      Max words in common with random sol’n.
      Q: expected overlap
      q_{x_i}: prob. of word x_i in slot i.
      PP-complete. Equivalent to stochastic satisfiability.
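The posterior score and the expected-overlap criterion can be seen side by side on a toy puzzle. The sketch below uses an invented 2x2 grid with made-up candidate probabilities, and it enumerates all fills by brute force, which is feasible only at this size given the hardness result on the slide. The expected overlap of a fill is computed as the sum of the posteriors of its words.

```python
from itertools import product

# Hypothetical 2x2 grid: A1 and A2 are the two rows, D1 and D2 the two columns.
SLOTS = ["A1", "A2", "D1", "D2"]
CANDIDATES = {
    "A1": {"TO": 0.5, "HA": 0.5},
    "A2": {"ON": 0.36, "IT": 0.34, "IS": 0.30},
    "D1": {"TO": 0.5, "HI": 0.5},
    "D2": {"ON": 0.36, "AT": 0.34, "AS": 0.30},
}


def solutions(candidates):
    """Normalized probability of every fill whose rows and columns agree."""
    weights = {}
    for a1, a2, d1, d2 in product(*(candidates[slot] for slot in SLOTS)):
        if d1 == a1[0] + a2[0] and d2 == a1[1] + a2[1]:
            weight = 1.0
            for slot, word in zip(SLOTS, (a1, a2, d1, d2)):
                weight *= candidates[slot][word]
            weights[(a1, a2, d1, d2)] = weight
    total = sum(weights.values())
    return {fill: weight / total for fill, weight in weights.items()}


def posteriors(sol_probs):
    """Posterior of a word in a slot: total probability of the fills that use it there."""
    q = {slot: {} for slot in SLOTS}
    for fill, p in sol_probs.items():
        for slot, word in zip(SLOTS, fill):
            q[slot][word] = q[slot].get(word, 0.0) + p
    return q


def expected_overlap(fill, q):
    """Expected number of words shared with a fill drawn from the solution distribution."""
    return sum(q[slot][word] for slot, word in zip(SLOTS, fill))


sol_probs = solutions(CANDIDATES)
q = posteriors(sol_probs)
most_probable = max(sol_probs, key=sol_probs.get)
max_overlap = max(sol_probs, key=lambda fill: expected_overlap(fill, q))
print("most probable:", most_probable)
print("max expected overlap:", max_overlap)
```

In this example the two criteria disagree: the single most probable fill shares no words with the other two fills, while those two share a row and a column, so one of them has the larger expected number of correct words.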

  30. Fast Approximation
      Compute exact posterior quickly on trees.
      • Only consider slots reachable in d steps.
      • Assume independence of paths (tree).
      Increase d iteratively, improve approx.
      Cache intermediate results (DP).
      Polytime!

  31. Connection to Turbo Decoding
      Turbo decoding (loopy belief propagation).
      • transmitting messages from deep space
      • true message = crossword solution
      • double encoding = across/down clues
      • corruption = answer uncertainty
      • 4-cycles
      Same decoding algorithm in use!

  32. Artificial Problems
      Random on 5x5 grids.
      Improves with d.
      NYT: 52% to 90%.

  33-43. Grid Filling

  44. Final: 88% words, 97% letters

  45. PROVERB Results
      Test collection (370 puzzles, ~15 min. each)
      • 95% words, 98% letters, 46% puzzles
      • NYT: 89.5% (95.5% MTW, 85.0% TFSS)
      • Ablation: ClueDB only 88%, no ClueDB 27%
      American Crossword Puzzle Tournament
      • 1998: 190/251, 80% words (vs. 100%)
        • tricks: letter pairs, words in single square
      • 1999: 147/261, 75% words
        • tricks: Home is near: ALASKA

  46. TOEFL Synonyms
      Used in college applications.
      fish
      • scale
      • angle
      • swim
      • dredge

  47. Synonym Approaches
      Latent Semantic Indexing (Landauer & Dumais 97)
      • Analyze 30k paragraphs, 300d embedding
      • 64% ~Boulder http://lsa.colorado.edu/
      Pointwise Mutual Information-IR (Turney 01)
      • Counts in Altavista (350M); 74% (us: 77%)
      Thesaurus
      • http://Wordsmyth.net: 98% prec.; 74% cov.
      • Combine with PMI-IR: 92%
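A minimal sketch of the PMI-IR idea: score each choice by how often it co-occurs with the problem word relative to how often it occurs at all, then pick the top score. This uses the simplest co-occurrence ratio; the hit counts below are invented placeholders rather than real search-engine counts, and no search API is called.

```python
def pmi_ir_scores(joint_hits, choice_hits):
    """Score each choice as hits(problem AND choice) / hits(choice)."""
    return {
        choice: joint_hits[choice] / hits
        for choice, hits in choice_hits.items()
        if hits > 0
    }


def pick_synonym(joint_hits, choice_hits):
    """Return the choice with the highest co-occurrence ratio."""
    scores = pmi_ir_scores(joint_hits, choice_hits)
    return max(scores, key=scores.get)


# Invented counts for the problem word "fish" and the four choices on the TOEFL slide.
hits_with_fish = {"scale": 30, "angle": 50, "swim": 40, "dredge": 5}
hits_alone = {"scale": 1000, "angle": 500, "swim": 800, "dredge": 200}

scores = pmi_ir_scores(hits_with_fish, hits_alone)
answer = pick_synonym(hits_with_fish, hits_alone)
print(scores)
print(answer)
```

Dividing by the choice's own hit count matters: a very common word would otherwise win simply by appearing near everything.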

  48. Verbal Analogies
      Used in college boards (SATs, GREs), and as an intelligence test.
      cat : meow ::
      • mouse : scamper
      • bird : peck
      • dog : bark
      • horse : groom
      • lion : scratch

  49. Wrap Up
      Modular language-game systems.
      PROVERB:
      • Human-competitive performance.
      • Components theoretically motivated.
      • Probabilistically grounded.

  50. What’s Next?
      Better module merging: Much work has been ad hoc. Now evaluating a probabilistic combining rule.
      RL: Corpus-based approach to behavior. Recognize similarities to past experience.
      Meaning comes from experience, not rules.