A Hybrid Relational Approach for Word Sense Disambiguation in Machine Translation

Slide 1

A Hybrid Relational Approach for Word Sense Disambiguation in Machine Translation Lucia Specia Mark Stevenson Maria G. V. Nunes

Slide 2

WSD in Machine Translation (MT)
Lexical choice in the case of semantic ambiguity.
Examples (English-Portuguese): take = tomar (carry out), levar (lead, coordinate, direct, control), aceitar (accept), pegar (pick, choose), etc.

Slide 3

WSD in Machine Translation (cont.)
One of the main challenges in MT.
Conflicting results on the usefulness of WSD for (statistical) MT: (Vickrey et al., 2005); (Carpuat and Wu, 2005).
Particularly for English-Portuguese, studies have shown that the lack of WSD modules is one of the main reasons for the unsatisfactory results of existing MT systems.
We hypothesize that an effective WSD module, specifically designed for MT, would improve MT performance.

Slide 4

Approaches to WSD
Knowledge-based: linguistic knowledge manually codified or extracted from lexical resources. Accurate, but suffers from the knowledge acquisition bottleneck.
Corpus-based: knowledge automatically acquired from text using machine learning. Wide coverage, but requires a large and reliable sample corpus.
Hybrid: mixes characteristics of the two other approaches. Exploits the advantages and minimizes the limitations of the other approaches → wide coverage and accurate results.

Slide 5

Approaches to multilingual WSD
Approaches to WSD as an application-independent task date back to the 1960s.
Most are monolingual, for English disambiguation.
However, WSD is application-dependent (Wilks and Stevenson, 1998; Kilgarriff, 1997; Resnik and Yarowsky, 1997).
WSD for MT differs from monolingual WSD (Hutchins and Somers, 1992), particularly with respect to the sense repository (Specia et al., 2006).

Slide 6

Approaches to multilingual WSD
Corpus-based and hybrid approaches use propositional formalisms (attribute-value vectors): limited expressiveness; data sparseness.
Ex1) John gave Mary a big cake.
Ex2) Give me something.
Consequences:
It is infeasible to represent substantial knowledge and use it during the learning process.
Hybrid approaches use knowledge only in pre-processing steps, before applying machine learning algorithms.

Slide 7

Proposal – a novel approach
LeAR (Lexical Ambiguity Resolution):
Specific for MT: senses, knowledge, techniques.
Hybrid (corpus- and knowledge-based): several knowledge sources (KSs) automatically acquired from corpus and lexical resources; evidence provided by examples of disambiguation extracted from automatically created sense-tagged corpora.
Relational formalism: highly expressive, avoiding data sparseness: each example is represented independently.
Inductive Logic Programming (ILP): relational symbolic supervised learning approach.

Slide 8

Inductive Logic Programming
ILP lies at the intersection of Machine Learning and Logic Programming.
It allows the efficient representation of substantial knowledge about the problem, and allows this knowledge to be used during the learning process (Muggleton, 1991).
[Diagram: background knowledge (1st-order clauses) and examples (1st-order clauses) are given to Aleph, which produces a theory (1st-order clauses).]

Slide 9

Inductive Logic Programming (cont.)
Given:
a set of positive and negative examples E = E+ ∪ E-;
a predicate p specifying the target relation to be learned;
knowledge K of a certain domain, which specifies which predicates q_i can be part of the definition of p.
The goal is: to induce a hypothesis (or theory) h for p, with respect to E and K, which covers most of the E+ without covering the E-.
Also: clauses representing K, E, and h must satisfy a set of syntactic restrictions S (language bias).
h can be used to classify new cases of disambiguation.

Slide 10

Inductive Logic Programming (cont.)
Aleph (Srinivasan, 2000):
Provides a complete relational learning inference engine.
Provides various customization options: induction methods; search strategies; evaluation functions; etc.
We are using: bottom-up search (generalization); non-incremental learning (batch learning); non-interactive learning (without user intervention); learning based on positive examples only.

Slide 11

Inductive Logic Programming (cont.)
The default inference engine induces a theory iteratively by means of the following steps:
1. One example is selected to be generalized. Ex.: sense(sent1,voltar).
2. A more specific clause (bottom clause), which explains the selected example, is built. It usually consists of the representation of all knowledge about that example.
3. A clause that is more generic than the bottom clause is searched for, by means of different search, evaluation, and generalization strategies.
4. The best clause found is added to the theory and the examples covered by that clause are removed from the example set. If there are more examples, return to step 1.
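The four steps above follow a greedy covering pattern, which can be sketched in a few lines of Python. This is a simplified illustration, not Aleph's actual algorithm. It assumes each example is a (sense, set of literal strings) pair, that the "bottom clause" is simply the full literal set of the seed example, that candidate clauses are small subsets of it, and that examples of other senses act as negatives for scoring (the slides state that learning used positive examples only). The function names and the scoring function are hypothetical.

```python
from itertools import combinations

def covers(body, facts):
    """A clause body covers an example when all of its literals hold for it."""
    return body <= facts

def induce(examples, max_literals=2):
    """Greedy covering sketch. examples: list of (sense, frozenset of literals).
    Returns the rules as (sense, body) pairs, in the order they were found."""
    remaining = list(examples)
    theory = []
    while remaining:
        # Step 1: select one example to generalise.
        seed_sense, seed_facts = remaining[0]
        # Step 2: most specific clause = everything known about the seed.
        bottom = seed_facts
        # Step 3: search clauses more general than the bottom clause
        # (here: subsets of its literals), scored as positives minus negatives.
        best_body, best_score = bottom, None
        for size in range(1, max_literals + 1):
            for literals in combinations(sorted(bottom), size):
                body = frozenset(literals)
                pos = sum(1 for s, f in remaining
                          if s == seed_sense and covers(body, f))
                neg = sum(1 for s, f in examples
                          if s != seed_sense and covers(body, f))
                if best_score is None or pos - neg > best_score:
                    best_body, best_score = body, pos - neg
        # Step 4: keep the best clause and drop the examples it covers.
        theory.append((seed_sense, best_body))
        remaining = [(s, f) for s, f in remaining
                     if not (s == seed_sense and covers(best_body, f))]
    return theory
```

Because every candidate body is a subset of the seed's own literals, the seed is always covered and removed, so the loop terminates.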

Slide 12

[Architecture diagram. Examples are processed by a parser and a POS tagger; lexical resources are LDOCE, WordNet, LDOCE + Password, and bilingual MRDs.
Knowledge sources: KS 1 bag-of-words (10); KS 2 POS of the narrow context (10); KS 3 subject-object syntactic relations; KS 4 11 collocations; KS 5 verb selectional restrictions, noun semantic features, hierarchical relations and feature types hierarchy; KS 6 phrasal verbs and idioms; KS 7 definition overlap counting, based on verb definitions and examples and a bag-of-words (200).
Together with mode, type and general definitions, and with rules for using each KS (bag-of-words, POS, syntactic relations, collocations, selectional restrictions, definition overlap, context, phrasal verbs and idioms), the KSs feed the ILP inference engine, which produces a rule-based model.]

Slide 13

Scope
Experiments with:
English-Portuguese MT. No studies have analyzed English-Portuguese.
10 highly frequent and ambiguous verbs. Relevant and difficult cases for English-Portuguese MT (Specia, 2005a).
Knowledge from syntactic, semantic and pragmatic sources. Working on knowledge which is specific to translation.
Although specifically designed for MT of verbs, the approach can be adapted for WSD of any words and languages.

Slide 14

Sample data
Corpus: fiction books, automatically tagged with the verb translation and manually reviewed (Specia et al., 2005a).

Slide 15

Knowledge sources
Example: sent1, verb "to come": "If there is such a thing as reincarnation, I would not mind coming back as a squirrel".
KS 1: Bag-of-words: ±5 words (lemmas) surrounding the verb for every sentence (sent_id).
bag(sent_id, list_of_words).
Ex.: bag(sent1, [mind,not,will,i,reincarnation,back,as,a,squirrel])
KS 2: Part-of-speech (POS) tags of content words in a ±5 word window surrounding the verb.
has_pos(sent_id, word_position, pos).
Ex.: has_pos(sent1, word_left_1, nn). has_pos(sent1, word_left_2, vbp). …
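As a rough illustration of how the KS 1 window could be extracted, the sketch below collects up to five lemmas on each side of the verb. It assumes the sentence has already been tokenised and lemmatised with punctuation removed, and that left-context words are listed nearest-first, which is inferred from the ordering in the bag(sent1, …) example rather than stated in the slides. The function name bag_of_words is hypothetical.

```python
def bag_of_words(lemmas, verb_index, window=5):
    """Return up to `window` lemmas on each side of the verb.
    Left context is listed nearest-first, then right context in reading order."""
    left = lemmas[max(0, verb_index - window):verb_index][::-1]
    right = lemmas[verb_index + 1:verb_index + 1 + window]
    return left + right

# Assumed lemmatisation of the sent1 example sentence (punctuation dropped).
sent1 = ["if", "there", "be", "such", "a", "thing", "as", "reincarnation",
         "i", "will", "not", "mind", "come", "back", "as", "a", "squirrel"]
bag = bag_of_words(sent1, sent1.index("come"))
```

Under these assumptions, `bag` holds the same nine lemmas, in the same order, as the bag(sent1, …) fact. A has_bag-style membership test is then just `word in bag`.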

Slide 16

Knowledge sources
KS 3: Subject and object syntactic relations with respect to the verb.
has_rel(sent_id, subject_word, object_word).
Ex.: has_rel(sent1, i, nil).
KS 4: Context words represented by 11 collocations with respect to the verb: first preposition to the right, first and second words to the left and right, first noun, first adjective, and first verb to the left and right.
has_collocation(sent_id, collocation_type, collocation).
Ex.: has_collocation(sent1, word_right_1, back). has_collocation(sent1, word_left_1, mind). …

Slide 17

Knowledge sources
KS 5: Selectional restrictions of verbs and semantic features of their arguments, from LDOCE.
rest(verb, subj_restriction, obj_restriction, translation).
Ex.: rest(come, [], nil, voltar). rest(come, [animal,human], nil, vir). rest(come, [], nil, aparecer). ...
feature(noun, sense_id, features).
Ex.: feature(reincarnation, 0_1, [abstract]). feature(reincarnation, 0_2, [animate]). feature(squirrel, 0_0, [animal]). …

Slide 18

Knowledge sources
KS 5 (cont.): Hierarchy for LDOCE feature types (Bruce and Guthrie, 1992).
relation(feature1, feature2).
Ex.: sub(human, animate). …
Ontological relations from WordNet.
relation(word1, sense_id1, word2, sense_id2).
Ex.: hyper(reincarnation, 1, symbol, 1). hyper(reincarnation, 3, religious_doctrine, 2). synon(rebirth, 2, resurrection, -1). …
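The slides do not spell out how a restriction is checked against a noun's features, so the following is only one plausible reading. It assumes that an empty restriction list accepts anything, and that a noun feature satisfies a restriction if it equals the restricted feature or is a subtype of it in the feature hierarchy. The names is_subtype and satisfies are hypothetical, and the hierarchy is modelled as a plain child-to-parent dictionary.

```python
def is_subtype(feature, target, parent_of):
    """True if `feature` equals `target` or reaches it by following
    child -> parent links in the feature hierarchy."""
    seen = set()
    while feature is not None and feature not in seen:
        if feature == target:
            return True
        seen.add(feature)              # guard against accidental cycles
        feature = parent_of.get(feature)
    return False

def satisfies(noun_features, restriction, parent_of):
    """Assumed semantics: an empty restriction accepts any noun; otherwise
    some noun feature must match some restricted feature."""
    if not restriction:
        return True
    return any(is_subtype(f, r, parent_of)
               for f in noun_features for r in restriction)

# Hierarchy entry in the style of the sub/2 facts: human is a subtype of animate.
parent_of = {"human": "animate"}
```

With this reading, a subject whose features are [animal] would satisfy the [animal,human] restriction of the "vir" entry, while a subject with only [abstract] would not.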

Slide 19

Knowledge sources
KS 6: Idioms and phrasal verbs.
exp(verbal_expression, translation).
Ex.: exp('come about', acontecer). exp('come about', chegar). exp('come to fruition', amadurecer). …
KS 7: A count of the overlapping words in dictionary definitions for the possible translations of the verb and the words surrounding it in the sentence.
highest_overlap(sent_id, translation, overlapping).
Ex.: highest_overlap(sent1, voltar, 0.222222). highest_overlap(sent1, chegar, 0.0857143). …
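A definition-overlap score of the kind used in KS 7 might be computed along the following lines. The slides give only the resulting values, so the normalisation here (shared words divided by the number of distinct definition words) is an assumption, as are the function names overlap_ratio and best_overlap. The definitions are passed in as plain word lists; no particular dictionary format is implied.

```python
def overlap_ratio(definition_words, context_words):
    """Share of distinct definition words that also occur in the context.
    The normalisation is an assumption; the slides do not specify it."""
    definition = set(definition_words)
    if not definition:
        return 0.0
    return len(definition & set(context_words)) / len(definition)

def best_overlap(definitions, context_words):
    """definitions: dict mapping each candidate translation to the words of
    its dictionary definition. Returns (translation, score) for the maximum."""
    scores = {t: overlap_ratio(words, context_words)
              for t, words in definitions.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Note that `best_overlap` raises `ValueError` when `definitions` is empty, since there is no candidate translation to choose from.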

Slide 20

Additional predicates
Examples:
sense(sent_id, translation).
Ex.: sense(sent1, voltar). sense(sent2, ir). …
Mode definitions:
Ex.: :- modeh(1, sense(sent, translation)). :- modeb(11, has_collocation(sent, colloc_id, colloc)). :- modeb(10, has_bag(sent, word)). …
Auxiliary predicates:
Ex.: has_bag(Sent, Word) :- bag(Sent, List), member(Word, List). …
bag(sent1, [mind,not,will,i,reincarnation,back,as,a,squirrel])

Slide 21

Example of rules produced
Verb "to come":
1. sense(A, sair) :- has_collocation(A, preposition_right_1, out).
2. sense(A, chegar) :- satisfy_restrictions(A, [animal,human],[concrete]), has_expression(A, 'come at').
3. sense(A, vir) :- satisfy_restriction(A, [human],[abstract]); has_collocation(A, word_right_1, from); (has_rel(A, subj, B), (has_pos(B,nn);has_pos(B,pron))).
4. sense(A, passar) :- (has_bag(A, to), has_bag(A, propernoun)); highest_overlapping(A,passar).
In order to classify new cases, rules must be applied in the order they are produced.
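The requirement that rules be applied in production order makes the model behave like a decision list: the first rule whose body holds decides the translation. A minimal Python sketch of that behaviour follows. It assumes each example is a dictionary of collocations, encodes rule 1 in full, and keeps only the has_collocation disjunct of rule 3 for brevity. The classify function and the example representation are hypothetical.

```python
def classify(rules, example, default=None):
    """Return the translation of the first rule whose condition holds."""
    for translation, condition in rules:
        if condition(example):
            return translation
    return default

# Rules in the order they were produced (rule 3 is deliberately simplified).
rules = [
    ("sair", lambda ex: ex["collocations"].get("preposition_right_1") == "out"),
    ("vir", lambda ex: ex["collocations"].get("word_right_1") == "from"),
]

# An example matching both rules gets the translation of the earlier one.
ambiguous = {"collocations": {"preposition_right_1": "out",
                              "word_right_1": "from"}}
chosen = classify(rules, ambiguous)
```

Reversing the rule list would change the answer for `ambiguous`, which is why the order in which the rules are produced has to be preserved.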

Slide 22

Evaluation
Induction strategies: induce: buil
