Index of papers in Proc. ACL 2010 that mention
  • sentence pair
Duan, Xiangyu and Zhang, Min and Li, Haizhou
Experiments and Results
Statistics of corpora: “Ch” denotes Chinese, “En” denotes English, the “Sent.” row is the number of sentence pairs, the “word” row is the number of words,
Experiments and Results
Figure 5 illustrates examples of pseudo-words of one Chinese-to-English sentence pair.
Experiments and Results
SSP has a strong constraint that all parts of a sentence pair should be aligned, so the source sentence and the target sentence have the same length after merging words into
Introduction
But computational complexity is prohibitively high for the exponentially large number of decompositions of a sentence pair into phrase pairs.
Introduction
pressions by monotonically segmenting a given Spanish-English sentence pair into bilingual units, where a word aligner is also used.
Searching for Pseudo-words
X and Y are a sentence pair and multi-word pairs, respectively, in this bilingual scenario.
Searching for Pseudo-words
Pseudo-word pairs of one sentence pair are the pairs that maximize the sum of the span-pairs’ bilingual sequence significances: $pwp_1^K = \arg\max_{span\text{-}pair_1^K} \sum_{k=1}^{K} Sig_{span\text{-}pair_k}$ (6)
Searching for Pseudo-words
Searching for pseudo-word pairs $pwp_1^K$ is equivalent to the bilingual segmentation of a sentence pair into the optimal $span\text{-}pair_1^K$.
sentence pair is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Wuebker, Joern and Mauser, Arne and Ney, Hermann
Alignment
The idea of forced alignment is to perform a phrase segmentation and alignment of each sentence pair of the training data using the full translation system as in decoding.
Alignment
Consequently, we can modify Equation 2 to define the best segmentation of a sentence pair as:
Alignment
The training data that consists of N parallel sentence pairs f_n and e_n for n = 1, .
Introduction
In this method, all phrases of the sentence pair that match constraints given by the alignment are extracted.
Related Work
When given a bilingual sentence pair, we can usually assume there are a number of equally correct phrase segmentations and corresponding alignments.
sentence pair is mentioned in 13 sentences in this paper.
Chen, Wenliang and Kazama, Jun'ichi and Torisawa, Kentaro
Bilingual subtree constraints
To solve the mapping problems, we use a bilingual corpus, which includes sentence pairs, to automatically generate the mapping rules.
Bilingual subtree constraints
First, the sentence pairs are parsed by monolingual parsers on both sides.
Bilingual subtree constraints
Figure 8 shows an example of a processed sentence pair that has tree structures on both sides and word alignment links.
Experiments
Note that some sentence pairs were removed because they are not one-to-one aligned at the sentence level (Burkett and Klein, 2008; Huang et al., 2009).
Experiments
Word alignments were generated from the Berkeley Aligner (Liang et al., 2006; DeNero and Klein, 2007) trained on a bilingual corpus having approximately 0.8M sentence pairs.
Motivation
Suppose that we have an input sentence pair as shown in Figure 1, where the source sentence is in English, the target is in Chinese, the dashed undirected links are word alignment links, and the directed links between words indicate that they have a (candidate) dependency relation.
sentence pair is mentioned in 8 sentences in this paper.
Liu, Zhanyi and Wang, Haifeng and Wu, Hua and Li, Sheng
Collocation Model
The monolingual corpus is first replicated to generate a parallel corpus, where each sentence pair consists of two identical sentences in the same language.
Experiments on Word Alignment
To investigate the quality of the generated word alignments, we randomly selected a subset from the bilingual corpus as the test set, including 500 sentence pairs.
Experiments on Word Alignment
(11), we also manually labeled a development set including 100 sentence pairs, in the same manner as the test set.
Improving Statistical Bilingual Word Alignment
According to the BWA method, given a bilingual sentence pair E = e_1^l and F = f_1^m, the optimal
Improving Statistical Bilingual Word Alignment
Thus, the collocation probability of the alignment sequence of a sentence pair can be calculated according to Eq.
Improving Statistical Bilingual Word Alignment
model to calculate the word alignment probability of a sentence pair, as shown in Eq.
sentence pair is mentioned in 8 sentences in this paper.
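The replication step quoted under "Collocation Model" above can be illustrated with a minimal sketch. The function name `replicate_corpus` and the two sample sentences are hypothetical and are not taken from the paper; the sketch only shows the pairing of each sentence with an identical copy, not the authors' alignment pipeline.

```python
def replicate_corpus(sentences):
    """Pair every monolingual sentence with an identical copy of itself.

    Hypothetical helper: the result has the shape of a parallel corpus
    whose source and target sides are the same language.
    """
    return [(sentence, sentence) for sentence in sentences]


# Illustrative sentences, not data from the paper.
monolingual = ["the team won the game", "she made a decision"]
parallel = replicate_corpus(monolingual)

for source, target in parallel:
    print(source, "|||", target)
```

Each resulting pair could then be passed to a word aligner as if it were bilingual data, which is the setting the snippet describes.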
Liu, Shujie and Li, Chi-Ho and Zhou, Ming
Basics of ITG Parsing
Larger and larger span pairs are recursively built until the sentence pair is built.
Basics of ITG Parsing
Figure 1(a) shows one possible derivation for a toy example sentence pair with three words in each sentence.
Evaluation
The 491 sentence pairs in this dataset are adapted to our own Chinese word segmentation standard.
Evaluation
250 sentence pairs are used as training data and the other 241 are test data.
The DITG Models
The MERT module for DITG takes alignment F-score of a sentence pair as the performance measure.
The DITG Models
Given an input sentence pair and the reference annotated alignment, MERT aims to maximize the F-score of DITG-produced alignment.
The DPDI Framework
Discriminative approaches to word alignment use manually annotated alignment for sentence pairs.
The DPDI Framework
Discriminative pruning, however, handles not only a sentence pair but every possible span pair.
sentence pair is mentioned in 8 sentences in this paper.
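The alignment F-score of a sentence pair, quoted under "The DITG Models" above, can be sketched as follows. This assumes the balanced F-score over sets of (source index, target index) links; the snippets do not state which F-score variant the paper uses, and the function name and example links are hypothetical.

```python
def alignment_f_score(predicted, reference):
    """Balanced F-score between predicted and reference alignment links.

    Assumption: links are (source_index, target_index) tuples and precision
    and recall are weighted equally.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    if correct == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(reference)
    return 2 * precision * recall / (precision + recall)


# Illustrative links for one three-word sentence pair, not data from the paper.
predicted_links = {(0, 0), (1, 2), (2, 1)}
reference_links = {(0, 0), (1, 1), (2, 1)}
score = alignment_f_score(predicted_links, reference_links)
print(round(score, 3))
```

A tuning procedure such as MERT would treat this per-sentence-pair score as the quantity to maximize against the annotated reference alignment.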
Sun, Jun and Zhang, Min and Tan, Chew Lim
Substructure Spaces for BTKs
Chinese | English # of Sentence pair 5000 Avg.
Substructure Spaces for BTKs
We randomly select 300 bilingual sentence pairs from the Chinese-English FBIS corpus with length ≤ 30 on both the source and target sides.
Substructure Spaces for BTKs
The selected plain sentence pairs are further parsed by the Stanford parser (Klein and Manning, 2003) on both the English and Chinese sides.
sentence pair is mentioned in 5 sentences in this paper.
Yamangil, Elif and Shieber, Stuart M.
Evaluation
This corpus consists of 1370 sentence pairs that were manually created from transcribed Broadcast News stories.
Evaluation
a pattern of compression used many times in the BNC in sentence pairs such as “NPR’s Anne Garrels reports” / “Anne Garrels reports”.
Sentence compression
An example sentence pair, which we use as a running example, is the following:
The STSG Model
See Figure 1 for an example of how an STSG with these rules would operate in synchronously generating our example sentence pair.
sentence pair is mentioned in 4 sentences in this paper.
Jiang, Wenbin and Liu, Qun
Experiments
It contains 239K sentence pairs with about 6.9M/8.9M words in Chinese/English.
Experiments
The alignment matrices for sentence pairs are generated according to (Liu et al., 2009).
Projected Classification Instance
Suppose a bilingual sentence pair, composed of a source sentence e and its target translation f. y_e is the parse tree of the source sentence.
sentence pair is mentioned in 3 sentences in this paper.