Index of papers in Proc. ACL 2009 that mention
  • word alignment
Ge, Ruifang and Mooney, Raymond
Conclusion and Future work
The approach also exploits methods from statistical MT (word alignment) and therefore integrates techniques from statistical syntactic parsing, MT, and compositional semantics to produce an effective semantic parser.
Experimental Evaluation
We also evaluated the impact of the word alignment component by replacing Giza++ with gold-standard word alignments manually annotated for the CLANG corpus.
Experimental Evaluation
The results consistently showed that, compared to using gold-standard word alignments, Giza++ produced lower semantic parsing accuracy when given very little training data, but similar or better results when given sufficient training data (> 160 examples).
Experimental Evaluation
This suggests that, given sufficient data, Giza++ can produce effective word alignments, and that imperfect word alignments do not seriously impair our semantic parsers since the disambiguation model evaluates multiple possible interpretations of ambiguous words.
Introduction
The learning system first employs a word alignment method from statistical machine translation (GIZA++ (Och and Ney, 2003)) to acquire a semantic lexicon that maps words to logical predicates.
Learning Semantic Knowledge
We use an approach based on Wong and Mooney (2006), which constructs word alignments between NL sentences and their MRs.
Learning Semantic Knowledge
Normally, word alignment is used in statistical machine translation to match words in one NL to words in another; here it is used to align words with predicates based on a “parallel corpus” of NL sentences and MRs. We assume that each word alignment defines a possible mapping from words to predicates for building a SAPT and semantic derivation which compose the correct MR. A semantic lexicon and composition rules are then extracted directly from the
Learning Semantic Knowledge
Generation of word alignments for each training example proceeds as follows.
Learning a Disambiguation Model
Here, unique word alignments are not required, and alternative interpretations compete for the best semantic parse.
word alignment is mentioned in 11 sentences in this paper.
Huang, Fei
Abstract
In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links.
Abstract
Based on these measures, we improve the alignment quality by selecting high confidence sentence alignments and alignment links from multiple word alignments of the same sentence pair.
Abstract
Additionally, we remove low confidence alignment links from the word alignment of a bilingual training corpus, which increases the alignment F-score, improves Chinese-English and Arabic-English translation quality and significantly reduces the phrase translation table size.
Introduction
Many MT systems, such as statistical phrase-based and syntax-based systems, learn phrase translation pairs or translation rules from large amounts of bilingual data with word alignment.
Introduction
The quality of the parallel data and the word alignment have significant impacts on the learned translation models and ultimately the quality of translation output.
Introduction
Given the huge amount of bilingual training data, word alignments are automatically generated using various algorithms (Brown et al., 1994; Vogel et al., 1996).
word alignment is mentioned in 22 sentences in this paper.
Sun, Jun and Zhang, Min and Tan, Chew Lim
Conclusions and Future Work
Although its greater sensitivity to word alignment errors enables SncTSSG to capture additional noncontiguous language phenomena, it also induces many redundant noncontiguous rules.
Experiments
We extract the tree sequence pairs based on the m-to-n word alignments produced by GIZA++.
Experiments
The STSSG or any contiguous translational equivalence based model is unable to attain the corresponding target output for this idiom word via the noncontiguous word alignment and considers it an out-of-vocabulary (OOV) word.
Experiments
On the contrary, the SncTSSG based model can capture the noncontiguous tree sequence pair consistent with the word alignment and further provide a reasonable target translation.
Introduction
(2006) statistically report that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints.
Tree Sequence Pair Extraction
Data structure: p[j1, j2] to store tree sequence pairs covering source span [j1, j2] 1: foreach source span [j1, j2], do 2: find a target span [i1, i2] with minimal length covering all the target words aligned to [j1, j2] 3: if all the target words in [i1, i2] are aligned with source words only in [j1, j2], then 4: pair each source tree sequence covering [j1, j2] with those in target covering [i1, i2] as a contiguous tree sequence pair
Tree Sequence Pair Extraction
7: create sub-span set s([i1, i2]) to cover all the target words aligned to [j1, j2]
Tree Sequence Pair Extraction
13: find a source span [j1, j2] with minimal length covering all the source words aligned to [i1, i2]
word alignment is mentioned in 11 sentences in this paper.
Liu, Yang and Lü, Yajuan and Liu, Qun
Experiments
We obtained word alignments of the training data by first running GIZA++ (Och and Ney, 2003) and then applying the refinement rule “grow-diag-final-and” (Koehn et al., 2003).
Introduction
The solid lines denote hyperedges and the dashed lines denote word alignments.
Model
The solid lines denote hyperedges and the dashed lines denote word alignments between the two forests.
Rule Extraction
By constructing a theory that gives formal semantics to word alignments, Galley et al.
Rule Extraction
Their GHKM procedure draws connections among word alignments , derivations, and rules.
Rule Extraction
They first identify the tree nodes that subsume tree-string pairs consistent with word alignments and then extract rules from these nodes.
word alignment is mentioned in 6 sentences in this paper.
Ganchev, Kuzman and Gillenwater, Jennifer and Taskar, Ben
Approach
First, parser and word alignment errors cause much of the transferred information to be wrong.
Experiments
For both corpora, we performed word alignments with the open source PostCAT (Graca et al., 2009) toolkit.
Experiments
Preliminary experiments showed that our word alignments were not always appropriate for syntactic transfer, even when they were correct for translation.
Introduction
Nevertheless, several challenges to accurate training and evaluation from aligned bitext remain: (1) partial word alignment due to non-literal or distant translation; (2) errors in word alignments and source language parses; (3) grammatical annotation choices that differ across languages and linguistic theories (e.g., how to analyze auxiliary verbs, conjunctions).
Related Work
(2005) found that transferring dependencies directly was not sufficient to get a parser with reasonable performance, even when both the source language parses and the word alignments were performed by hand.
word alignment is mentioned in 5 sentences in this paper.
Xiong, Deyi and Zhang, Min and Aw, Aiti and Li, Haizhou
Experiments
To obtain word-level alignments, we ran GIZA++ (Och and Ney, 2000) on the remaining corpus in both directions, and applied the “grow-diag-final” refinement rule (Koehn et al., 2005) to produce the final many-to-many word alignments.
Introduction
According to the word alignments , we define bracketable and unbracketable instances.
The Acquisition of Bracketing Instances
Let c and e be the source sentence and the target sentence, W be the word alignment between them, and T be the parse tree of c. We define a binary bracketing instance as a tuple ⟨b, τ(c[i..j]), τ(c[j+1..k]), τ(c[i..k])⟩, where b ∈ {bracketable, unbracketable}, c[i..j] and c[j+1..k] are two neighboring source phrases, and τ(T, s) (τ(s) for short) is a subtree function which returns the minimal subtree covering the source sequence s from the source parse tree T. Note that τ(c[i..k]) includes both τ(c[i..j]) and τ(c[j+1..k]).
The Acquisition of Bracketing Instances
1: Input: sentence pair (c, e), the parse tree T of c and the word alignment W between c and e 2: B := ∅ 3: for each (i, j, k) ∈ c do 4: if there exist target phrases e[m..n] aligned to c[i..j] and e[p..q] aligned to c[j+1..k], then
word alignment is mentioned in 4 sentences in this paper.
Branavan, S.R.K. and Chen, Harr and Zettlemoyer, Luke and Barzilay, Regina
Experimental Setup
Additionally, we compute a word alignment score to investigate the extent to which the input text is used to construct correct analyses.
Results
The word alignment results from Table 2 indicate that the learners are mapping the correct words to actions for documents that are successfully completed.
Results
For example, the models that perform best in the Windows domain achieve nearly perfect word alignment scores.
word alignment is mentioned in 3 sentences in this paper.
Liang, Percy and Jordan, Michael and Klein, Dan
Experiments
Many of the remaining errors are due to the garbage collection phenomenon familiar from word alignment models (Moore, 2004; Liang et al., 2006).
Generative Model
The alignment aspect of our model is similar to the HMM model for word alignment (Ney and Vogel, 1996).
Generative Model
(2008) perform joint segmentation and word alignment for machine translation, but the nature of that task is different from ours.
word alignment is mentioned in 3 sentences in this paper.
Wu, Hua and Wang, Haifeng
Pivot Methods for Phrase-based SMT
(2003), there are two important elements in the lexical weight: word alignment information a in a phrase pair (s̄, t̄) and lexical translation probability w(s|t)
Pivot Methods for Phrase-based SMT
Let a1 and a2 represent the word alignment information inside the phrase pairs (s̄, p̄) and (p̄, t̄).
Pivot Methods for Phrase-based SMT
Based on the induced word alignment information, we estimate the co-occurring frequencies of word pairs directly from the induced phrase pairs.
word alignment is mentioned in 3 sentences in this paper.
Zhang, Hui and Zhang, Min and Li, Haizhou and Aw, Aiti and Tan, Chew Lim
Experiment
GIZA++ (Och and Ney, 2003) and the heuristics “grow-diag-final-and” are used to generate m-to-n word alignments.
Experiment
This is mainly because tree sequence rules are sensitive only to the word alignment, whereas tree rules, even when extracted from a forest (as in FT2S), are additionally constrained by the syntax of the grammar's parsing rules.
Forest-based tree sequence to string model
Given a source forest F and target translation TS as well as word alignment A, our translation model is formulated as:
word alignment is mentioned in 3 sentences in this paper.