Index of papers in Proc. ACL 2011 that mention
  • phrase pairs
Neubig, Graham and Watanabe, Taro and Sumita, Eiichiro and Mori, Shinsuke and Kawahara, Tatsuya
Flat ITG Model
(a) If x = TERM, generate a phrase pair from the phrase table Pt(⟨e, f⟩; θt).
Flat ITG Model
(b) If x = REG, a regular ITG rule, generate phrase pairs ⟨e1, f1⟩ and ⟨e2, f2⟩ from Pflat, and concatenate them into a single phrase pair ⟨e1e2, f1f2⟩.
Flat ITG Model
While the previous formulation can be used as-is in maximum likelihood training, this leads to a degenerate solution where every sentence is memorized as a single phrase pair.
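The generative story in (a) and (b) can be sketched as a toy sampler. The distributions and phrase table below are illustrative stand-ins, not the paper's learned model, and the INV case follows the standard ITG convention of inverting target-side order, which the excerpts above do not show:

```python
import random

# Toy stand-ins for the model's distributions: Px over ITG symbols and
# Pt, a phrase table (both hypothetical; the real model learns these).
P_SYMBOL = {"TERM": 0.6, "REG": 0.3, "INV": 0.1}
PHRASE_TABLE = [("hello", "bonjour"), ("world", "monde")]

def generate_flat_itg(depth=0, max_depth=3):
    """Sample one phrase pair from the flat ITG generative story:
    TERM emits a pair from the phrase table; REG recurses twice and
    concatenates in order; INV concatenates with target order inverted."""
    if depth >= max_depth:
        symbol = "TERM"  # force termination of the recursion
    else:
        symbol = random.choices(list(P_SYMBOL), weights=list(P_SYMBOL.values()))[0]
    if symbol == "TERM":
        return random.choice(PHRASE_TABLE)
    e1, f1 = generate_flat_itg(depth + 1, max_depth)
    e2, f2 = generate_flat_itg(depth + 1, max_depth)
    if symbol == "REG":          # regular rule: <e1 e2, f1 f2>
        return (e1 + " " + e2, f1 + " " + f2)
    return (e1 + " " + e2, f2 + " " + f1)  # inverted rule: <e1 e2, f2 f1>
```

Because every recursion bottoms out in TERM, an unconstrained sampler would also happily emit a whole sentence as one giant pair, which is the degenerate maximum-likelihood solution the paper warns about.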
Introduction
The training of translation models for phrase-based statistical machine translation (SMT) systems (Koehn et al., 2003) takes unaligned bilingual training data as input, and outputs a scored table of phrase pairs.
Introduction
The model is similar to previously proposed phrase alignment models based on inversion transduction grammars (ITGs) (Cherry and Lin, 2007; Zhang et al., 2008; Blunsom et al., 2009), with one important change: ITG symbols and phrase pairs are generated in the opposite order.
Introduction
In traditional ITG models, the branches of a biparse tree are generated from a nonterminal distribution, and each leaf is generated by a word or phrase pair distribution.
“phrase pairs” is mentioned in 33 sentences in this paper.
Topics mentioned in this paper:
Zollmann, Andreas and Vogel, Stephan
Clustering phrase pairs directly using the K-means algorithm
Using multiple word clusterings simultaneously, each based on a different number of classes, could turn this global, hard tradeoff into a local, soft one, informed by the number of phrase pair instances available for a given granularity.
Clustering phrase pairs directly using the K-means algorithm
We thus propose to represent each phrase pair instance (including its bilingual one-word contexts) as feature vectors, i.e., points of a vector space.
Clustering phrase pairs directly using the K-means algorithm
We then use these data points to partition the space into clusters, and subsequently assign each phrase pair instance the cluster of its corresponding feature vector as its label.
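A minimal version of this clustering step can be sketched as follows. The bag-of-words featurization and the naive first-k initialization are simplifying assumptions for illustration; the paper's feature set and K-means setup are richer:

```python
from collections import Counter

def featurize(phrase_pair, context, vocab):
    """Bag-of-words vector for one phrase pair instance plus its one-word
    bilingual contexts (a hypothetical, simplified feature set)."""
    counts = Counter(w for text in (*phrase_pair, *context) for w in text.split())
    return [counts[v] for v in vocab]

def kmeans(points, k, iters=20):
    """Minimal K-means: repeatedly assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for n, p in enumerate(points):
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
            labels[n] = j
        for j, members in enumerate(clusters):
            if members:  # keep empty clusters' centroids unchanged
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels
```

Each phrase pair instance then carries its cluster index as a label, which is what the hard rule labeling below consumes.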
Hard rule labeling from word classes
(2003) to provide us with a set of phrase pairs for each sentence pair in the training corpus, annotated with their respective start and end positions in the source and target sentences.
Hard rule labeling from word classes
We convert each extracted phrase pair, represented by its source span (i, j) and target span (k, l), into an initial rule
Hard rule labeling from word classes
Then (depending on the extracted phrase pairs), the resulting initial rules could be:
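The span-to-rule conversion described above might look roughly like this sketch. The inclusive-span convention and the generic `X` label are assumptions for illustration; the paper derives richer labels from word classes:

```python
def initial_rule(src_tokens, tgt_tokens, src_span, tgt_span):
    """Turn a phrase pair, given by its source span (i, j) and target
    span (k, l) (inclusive, an assumed convention), into an unlabeled
    initial rule X -> <f, e>."""
    i, j = src_span
    k, l = tgt_span
    return ("X", tuple(src_tokens[i:j + 1]), tuple(tgt_tokens[k:l + 1]))
```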
Introduction
Zollmann and Venugopal (2006) directly extend the rule extraction procedure from Chiang (2005) to heuristically label any phrase pair based on target language parse trees.
PSCFG-based translation
Chiang (2005) learns a single-nonterminal PSCFG from a bilingual corpus by first identifying initial phrase pairs using the technique from Koehn et al.
PSCFG-based translation
(2003), and then performing a generalization operation to generate phrase pairs with gaps, which can be viewed as PSCFG rules with generic ‘X’ nonterminal left-hand-sides and substitution sites.
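Chiang's generalization operation, as described, can be sketched as follows: wherever a smaller extracted phrase pair occurs inside a larger one, both of its sides are replaced with a gap filled by the generic `X` nonterminal. The helper names here are hypothetical:

```python
def _replace_sub(seq, sub):
    """Replace the first occurrence of the contiguous subsequence `sub`
    in `seq` with a single gap symbol 'X'."""
    seq, sub = list(seq), list(sub)
    for i in range(len(seq) - len(sub) + 1):
        if seq[i:i + len(sub)] == sub:
            return tuple(seq[:i] + ["X"] + seq[i + len(sub):])
    return tuple(seq)

def generalize(src, tgt, sub_src, sub_tgt):
    """Given a phrase pair (src, tgt) and a smaller phrase pair
    (sub_src, sub_tgt) occurring inside it, produce a PSCFG-style rule
    with a linked gap on both sides."""
    return _replace_sub(src, sub_src), _replace_sub(tgt, sub_tgt)
```

The gapped pair corresponds to a rule with an `X` left-hand side and an `X` substitution site on the right-hand side.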
“phrase pairs” is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
DeNero, John and Macherey, Klaus
Conclusion
The resulting predictions improve the precision and recall of both alignment links and extracted phrase pairs in Chinese-English experiments.
Experimental Results
In this way, we can show that the bidirectional model improves alignment quality and enables the extraction of more correct phrase pairs.
Experimental Results
Table 3: Phrase pair extraction accuracy for phrase pairs up to length 5.
Experimental Results
Possible links are both included and excluded from phrase pairs during extraction, as in DeNero and Klein (2010).
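Extraction accuracy of the kind reported in Table 3 can be computed with a simple precision/recall sketch over sets of extracted versus gold phrase pairs; the paper's exact evaluation protocol (e.g. handling of possible links) may differ:

```python
def phrase_pair_prf(predicted, gold):
    """Precision, recall, and F1 of extracted phrase pairs against a gold
    set, treating each (source span, target span) pair as one item."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # correctly extracted pairs
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```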
“phrase pairs” is mentioned in 6 sentences in this paper.
Topics mentioned in this paper: