Index of papers in Proc. ACL that mention
  • sentence pairs
Liu, Le and Hong, Yu and Liu, Hao and Wang, Xing and Yao, Jianmin
Abstract
Most current data selection methods rely solely on language models trained on small-scale in-domain data to select domain-relevant sentence pairs from a general-domain parallel corpus.
Abstract
By contrast, we argue that the relevance between a sentence pair and the target domain can be better evaluated by combining a language model and a translation model.
Abstract
When the selected sentence pairs are evaluated on an end-to-end MT task, our methods can increase the translation performance by 3 BLEU points.
Introduction
For this, an effective approach is to automatically select and expand domain-specific sentence pairs from a large-scale general-domain parallel corpus.
Introduction
Current data selection methods mostly use language models trained on small-scale in-domain data to measure domain relevance and select domain-relevant parallel sentence pairs to expand training corpora.
Introduction
Meanwhile, the translation model measures the translation probability of a sentence pair, which is used to verify the parallelism of the selected domain-relevant bitext.
Related Work
(2010) ranked the sentence pairs in the general-domain corpus according to the perplexity scores of sentences, which are computed with respect to in-domain language models.
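The perplexity-based ranking described in this excerpt can be sketched in a few lines. This is an illustrative sketch, not any paper's actual implementation: a toy add-one-smoothed unigram model stands in for a real in-domain language model, and general-domain pairs are ranked by target-side perplexity, lowest (most domain-relevant) first.

```python
import math
from collections import Counter

def train_unigram_lm(sentences):
    """Toy in-domain unigram LM with add-one smoothing (stand-in for a real LM)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def perplexity(lm, sentence):
    words = sentence.split()
    logp = sum(math.log(lm(w)) for w in words)
    return math.exp(-logp / len(words))

def rank_by_perplexity(in_domain, general_pairs):
    """Rank general-domain (src, tgt) pairs by target-side in-domain perplexity."""
    lm = train_unigram_lm(in_domain)
    return sorted(general_pairs, key=lambda pair: perplexity(lm, pair[1]))

# Hypothetical mini-corpora for illustration.
in_domain = ["the patient received treatment", "the treatment was effective"]
pairs = [("...", "stock prices fell sharply"),
         ("...", "the patient received effective treatment")]
ranked = rank_by_perplexity(in_domain, pairs)
print(ranked[0][1])  # the in-domain-like pair ranks first
```

A real system would score with a smoothed n-gram or neural LM; the ranking principle is the same.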
Related Work
Although previous works in data selection (Duh et al., 2013; Koehn and Haddow, 2012; Axelrod et al., 2011; Foster et al., 2010; Yasuda et al., 2008) have gained good performance, the methods which only adopt language models to score the sentence pairs are suboptimal.
Related Work
The reason is that a sentence pair contains a source language sentence and a target language sentence, while the existing methods are incapable of evaluating the mutual translation probability of a sentence pair in the target domain.
Training Data Selection Methods
We present three data selection methods for ranking and selecting domain-relevant sentence pairs from general-domain corpus, with an eye towards improving domain-specific translation model performance.
Training Data Selection Methods
However, in this paper, we adopt the translation model to evaluate the translation probability of a sentence pair and develop a simple but effective variant of the translation model to rank the sentence pairs in the general-domain corpus.
sentence pairs is mentioned in 35 sentences in this paper.
Chang, Yin-Wen and Rush, Alexander M. and DeNero, John and Collins, Michael
Abstract
The algorithm finds provably exact solutions on 86% of sentence pairs and shows improvements over directional models.
Conclusion
optimal score, averaged over all sentence pairs.
Conclusion
The plot shows an average over all sentence pairs.
Conclusion
We can construct a sentence pair in which I = J = N and e-alignments have infinite cost.
Experiments
Following past work, the first 150 sentence pairs of the training section are used for evaluation.
Experiments
With incremental constraints and pruning, we are able to solve over 86% of sentence pairs including many longer and more difficult pairs.
Experiments
Table 3: The average number of constraints added for sentence pairs where Lagrangian relaxation is not able to find an exact solution.
Introduction
• Empirically, it is able to find exact solutions on 86% of sentence pairs and is significantly faster than general-purpose solvers.
sentence pairs is mentioned in 11 sentences in this paper.
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Experiments
These documents are built in the format of an inverted index using Lucene2, which can be efficiently retrieved by the parallel sentence pairs.
Experiments
In the fine-tuning phase, for each parallel sentence pair, we randomly select ten other sentence pairs which satisfy the criterion as negative instances.
Experiments
In total, the datasets contain nearly 1.1 million sentence pairs.
Topic Similarity Model with Neural Network
Given a parallel sentence pair (f, e), the first step is to treat f and e as queries, and use IR methods to retrieve relevant documents to enrich contextual information for them.
Topic Similarity Model with Neural Network
Therefore, in this stage, parallel sentence pairs are used to help connect the vectors from different languages because they express the same topic.
Topic Similarity Model with Neural Network
Given a parallel sentence pair (f, e), the DAB learns representations for f and e respectively, as zf = g(f) and ze = g(e) in Figure 1.
sentence pairs is mentioned in 13 sentences in this paper.
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
A Joint Model with Unlabeled Parallel Text
the sentence pairs may be noisily parallel (or even comparable) instead of fully parallel (Munteanu and Marcu, 2005).
A Joint Model with Unlabeled Parallel Text
In such noisy cases, the labels (positive or negative) could be different for the two monolingual sentences in a sentence pair.
A Joint Model with Unlabeled Parallel Text
Although we do not know the exact probability that a sentence pair exhibits the same label, we can approximate it using their translation
Experimental Setup 4.1 Data Sets and Preprocessing
Because sentence pairs in the ISI corpus are quite noisy, we rely on Giza++ (Och and Ney, 2003) to obtain a new translation probability for each sentence pair, and select the 100,000 pairs with the highest translation probabilities.5
Experimental Setup 4.1 Data Sets and Preprocessing
We then classify each unlabeled sentence pair by combining the two sentences in each pair into one.
Experimental Setup 4.1 Data Sets and Preprocessing
5We removed sentence pairs with an original confidence score (given in the corpus) smaller than 0.98, and also removed the pairs that are too long (more than 60 characters in one sentence) to facilitate Giza++.
Results and Analysis
Preliminary experiments showed that Equation 5 does not significantly improve the performance in our case, which is reasonable since we choose only sentence pairs with the highest translation probabilities to be our unlabeled data (see Section 4.1).
Results and Analysis
However, even with only 2,000 unlabeled sentence pairs, the proposed approach still produces large performance gains.
Results and Analysis
Examination of those sentence pairs in setting 2 for which the two monolingual models still
sentence pairs is mentioned in 11 sentences in this paper.
Wuebker, Joern and Mauser, Arne and Ney, Hermann
Alignment
The idea of forced alignment is to perform a phrase segmentation and alignment of each sentence pair of the training data using the full translation system as in decoding.
Alignment
Consequently, we can modify Equation 2 to define the best segmentation of a sentence pair as:
Alignment
The training data consists of N parallel sentence pairs fn and en for n = 1, .
Introduction
In this method, all phrases of the sentence pair that match constraints given by the alignment are extracted.
Related Work
When given a bilingual sentence pair, we can usually assume there are a number of equally correct phrase segmentations and corresponding alignments.
sentence pairs is mentioned in 13 sentences in this paper.
Ling, Wang and Xiang, Guang and Dyer, Chris and Black, Alan and Trancoso, Isabel
Experiments
The y-axis denotes the scores for each metric, and the x-axis denotes the percentage of the highest scoring sentence pairs that are kept.
Experiments
However, translation models are generally robust to such kinds of errors and can learn good translations even in the presence of imperfect sentence pairs.
Experiments
Example sentence pairs.
Parallel Data Extraction
In this process, lexical tables for EN-ZH language pair used by Model 1 were built using the FBIS dataset (LDC2003E14) for both directions, a corpus of 300K sentence pairs from the news domain.
Parallel Data Extraction
Likewise, for the EN-AR language pair, we use a fraction of the NIST dataset, by removing the data originating from the UN, which leads to approximately 1M sentence pairs.
Parallel Segment Retrieval
This is obviously not our goal, since we would not obtain any useful sentence pairs.
Parallel Segment Retrieval
It is highest for segmentations that cover all the words in the document (this is desirable since there are many sentence pairs that can be extracted but we want to find the largest sentence pair in the document).
sentence pairs is mentioned in 17 sentences in this paper.
Duan, Xiangyu and Zhang, Min and Li, Haizhou
Experiments and Results
Statistics of corpora: “Ch” denotes Chinese, “En” denotes English, the “Sent.” row is the number of sentence pairs, the “word” row is the number of words,
Experiments and Results
Figure 5 illustrates examples of pseudo-words of one Chinese-to-English sentence pair.
Experiments and Results
SSP has a strong constraint that all parts of a sentence pair should be aligned, so the source sentence and the target sentence have the same length after merging words into
Introduction
But computational complexity is prohibitively high for the exponentially large number of decompositions of a sentence pair into phrase pairs.
Introduction
pressions by monotonically segmenting a given Spanish-English sentence pair into bilingual units, where word aligner is also used.
Searching for Pseudo-words
X and Y are sentence pair and multi-word pairs respectively in this bilingual scenario.
Searching for Pseudo-words
Pseudo-word pairs of one sentence pair are such pairs that maximize the sum of the span-pairs’ bilingual sequence significances: pwp_1^K = argmax_{span-pair_1^K} Σ_{k=1}^{K} Sig_{span-pair_k} (6)
Searching for Pseudo-words
Searching for pseudo-word pairs pwp_1^K is equal to bilingual segmentation of a sentence pair into optimal span-pair_1^K.
sentence pairs is mentioned in 12 sentences in this paper.
Huang, Fei
Abstract
Based on these measures, we improve the alignment quality by selecting high confidence sentence alignments and alignment links from multiple word alignments of the same sentence pair.
Alignment Link Confidence Measure
Similar to the sentence alignment confidence measure, the confidence of an alignment link aij in the sentence pair (S, T) is defined as
Alignment Link Confidence Measure
From multiple alignments of the same sentence pair, we select high confidence links from different alignments based on their link confidence scores and alignment agreement ratio.
Alignment Link Confidence Measure
Suppose the sentence pair (S, T) has alignments A1,.
Improved MaxEnt Aligner with Confidence-based Link Filtering
512 sentence pairs, and the A-E alignment test set is the 200 Arabic-English sentence pairs from NIST MT03 test set.
Introduction
The example in Figure 1 shows the word alignment of the given Chinese and English sentence pair, where the English words following each Chinese word are its literal translation.
Introduction
In this paper we introduce a confidence measure for word alignment, which is robust to extra or missing words in the bilingual sentence pairs, as well as word alignment errors.
Sentence Alignment Confidence Measure
Given a bilingual sentence pair (S, T) where S = {s1, . .
Sentence Alignment Confidence Measure
We randomly selected 512 Chinese-English (CE) sentence pairs and generated word alignment using the MaxEnt aligner (Ittycheriah and Roukos, 2005).
Sentence Alignment Confidence Measure
For each sentence pair, we also calculate the sentence alignment confidence score −log C(A|S, T).
sentence pairs is mentioned in 15 sentences in this paper.
Chan, Yee Seng and Ng, Hwee Tou
Automatic Evaluation Metrics
A uniform average of the counts is then taken as the score for the sentence pair.
Automatic Evaluation Metrics
Based on the matches, ParaEval will then elect to use either unigram precision or unigram recall as its score for the sentence pair.
Introduction
To match each system item to at most one reference item, we model the items in the sentence pair as nodes in a bipartite graph and use the Kuhn-Munkres algorithm (Kuhn, 1955; Munkres, 1957) to find a maximum weight matching (or alignment) between the items in polynomial time.
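The bipartite-matching formulation in this excerpt can be illustrated with a small sketch. For clarity this uses brute force over permutations rather than the polynomial-time Kuhn-Munkres algorithm the excerpt names, and the similarity weights are hypothetical; real implementations use an O(n³) solver such as scipy.optimize.linear_sum_assignment.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Maximum-weight matching between system items (rows) and reference
    items (columns), each system item matched to at most one reference item.
    Brute force over permutations for illustration only; Kuhn-Munkres
    solves the same problem in polynomial time."""
    n, m = len(weights), len(weights[0])
    assert n <= m
    best_score, best_match = float("-inf"), None
    for perm in permutations(range(m), n):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_match = score, list(enumerate(perm))
    return best_score, best_match

# Hypothetical similarity of each system item to each reference item.
w = [[0.9, 0.1],
     [0.4, 0.8]]
score, match = max_weight_matching(w)
print(score, match)  # best total weight, with item 0->0 and 1->1
```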
Introduction
Also, metrics such as METEOR determine an alignment between the items of a sentence pair by using heuristics such as the least number of matching crosses.
Introduction
Also, this framework allows for defining arbitrary similarity functions between two matching items, and we could match arbitrary concepts (such as dependency relations) gathered from a sentence pair.
Metric Design Considerations
Similarly, we also match the bigrams and trigrams of the sentence pair and calculate their corresponding Fmean scores.
Metric Design Considerations
To obtain a single similarity score score_s for this sentence pair s, we simply average the three Fmean scores.
Metric Design Considerations
Then, to obtain a single similarity score Sim-score for the entire system corpus, we repeat this process of calculating score_s for each system-reference sentence pair s, and compute the average over all |S| sentence pairs:
sentence pairs is mentioned in 14 sentences in this paper.
Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai
DNN for word alignment
Given a sentence pair (e, f), HMM word alignment takes the following form:
DNN for word alignment
To decode our model, the lexical translation scores are computed for each source-target word pair in the sentence pair, which requires going through the neural network |e| × |f| times; after that, the forward-backward algorithm can be used to find the Viterbi path as in the classic HMM model.
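The Viterbi decoding step mentioned here can be sketched as a stand-alone routine. This is illustrative, not the paper's code: the emission table stands in for the neural-network lexical translation scores, and all numbers are hypothetical.

```python
def viterbi_align(emis, trans, init):
    """Viterbi decoding for an HMM word aligner.
    emis[j][i]  = p(f_j | e_i): lexical translation score
    trans[i][k] = p(a_j = k | a_{j-1} = i): jump probability
    init[i]     = p(a_1 = i)
    Returns the most probable alignment a_1..a_J (a source index per target word)."""
    J, I = len(emis), len(init)
    delta = [[0.0] * I for _ in range(J)]
    back = [[0] * I for _ in range(J)]
    for i in range(I):
        delta[0][i] = init[i] * emis[0][i]
    for j in range(1, J):
        for i in range(I):
            best_k = max(range(I), key=lambda k: delta[j - 1][k] * trans[k][i])
            back[j][i] = best_k
            delta[j][i] = delta[j - 1][best_k] * trans[best_k][i] * emis[j][i]
    a = [max(range(I), key=lambda i: delta[J - 1][i])]
    for j in range(J - 1, 0, -1):
        a.append(back[j][a[-1]])
    return a[::-1]

# Hypothetical scores for a 2-word target and 2-word source sentence pair.
emis = [[0.7, 0.1], [0.2, 0.6]]   # p(f_j | e_i)
trans = [[0.6, 0.4], [0.4, 0.6]]  # p(a_j | a_{j-1})
init = [0.5, 0.5]
print(viterbi_align(emis, trans, init))  # [0, 1]
```

The cost is O(J·I²), matching the classic HMM alignment complexity.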
Experiments and Results
We use the manually aligned Chinese-English alignment corpus (Haghighi et al., 2009) which contains 491 sentence pairs as test set.
Experiments and Results
Our parallel corpus contains about 26 million unique sentence pairs in total which are mined from web.
Training
1In practice, the number of nonzero parameters in the classic HMM model would be much smaller, as many words do not co-occur in bilingual sentence pairs.
Training
our model from raw sentence pairs, they are too computationally demanding as the lexical translation probabilities must be computed from neural networks.
Training
Hence, we opt for a simpler supervised approach, which learns the model from sentence pairs with word alignment.
sentence pairs is mentioned in 10 sentences in this paper.
Yang, Nan and Li, Mu and Zhang, Dongdong and Yu, Nenghai
Experiments
They are manually translated into the other language to produce 7,000 sentence pairs , which are split into two parts: 2,000 pairs as development set (dev) and the other 5,000 pairs as test set (web test).
Experiments
After removing duplicates, we have about 18 million sentence pairs, which contain about 270 million English tokens and 320 million Japanese tokens.
Experiments
As we do not have access to a golden reordered sentence set, we decide to use the number of alignment crossing links between aligned sentence pairs as the measure of reordering performance.
Ranking Model Training
For a sentence pair (e, f, a) with syntax tree Te on the source side, we need to determine which reordered tree Te′ best represents the word order in target sentence f. For a tree node t in Te, if its children align to disjoint target spans, we can simply arrange them in the order of their corresponding target
Ranking Model Training
Figure 2: Fragment of a sentence pair.
Ranking Model Training
Figure 2 shows a fragment of one sentence pair in our training data.
Word Reordering as Syntax Tree Node Ranking
Figure 1: An English-to-Japanese sentence pair.
sentence pairs is mentioned in 10 sentences in this paper.
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Bootstrapping Phrasal ITG from Word-based ITG
Figure 3 (a) shows all possible non-compositional phrases given the Viterbi word alignment of the example sentence pair .
Experiments
The training data was a subset of 175K sentence pairs from the NIST Chinese-English training data, automatically selected to maximize character-level overlap with the source side of the test data.
Experiments
We put a length limit of 35 on both sides, producing a training set of 141K sentence pairs.
Experiments
Figure 4 examines the difference between EM and VB with varying sparse priors for the word-based model of ITG on the 500 sentence pairs, both after 10 iterations of training.
Introduction
Finally, the set of phrases consistent with the word alignments are extracted from every sentence pair; these form the basis of the decoding process.
Introduction
Computational complexity arises from the exponentially large number of decompositions of a sentence pair into phrase pairs; overfitting is a problem because as EM attempts to maximize the likelihood of its training data, it prefers to directly explain a sentence pair with a single phrase pair.
Phrasal Inversion Transduction Grammar
However, it is easy to show that the maximum likelihood training will lead to the saturated solution where Pc = 1: each sentence pair is generated by a single phrase spanning the whole sentence.
Summary of the Pipeline
Then we use the efficient bidirectional tic-tac-toe pruning to prune the bitext space within each of the sentence pairs; ITG parsing will be carried out on only this sparse set of bitext cells.
Variational Bayes for ITG
If we do not put any constraint on the distribution of phrases, EM overfits the data by memorizing every sentence pair.
sentence pairs is mentioned in 10 sentences in this paper.
Nguyen, ThuyLinh and Vogel, Stephan
Experiment Results
The Arabic-English system was trained from 264K sentence pairs with true case English.
Experiment Results
The Chinese-English system was trained on the FBIS corpus of 384K sentence pairs; the English corpus is lowercased.
Experiment Results
The systems were trained on 1.8 million sentence pairs using the Europarl corpora.
Introduction
From this Hiero derivation, we have a segmentation of the sentence pairs into phrase pairs according to the word alignments, as shown on the left side of Figure 1.
Phrasal-Hiero Model
In the rule X → je X1 le Français ; I X1 french, extracted from the sentence pair in Figure 1, the phrase le Français connects to the phrase french because the French word Français aligns with the English word french even though le is unaligned.
Phrasal-Hiero Model
Figure 2: Alignment of a sentence pair.
Phrasal-Hiero Model
For example, in the rule r4 = X → je X1 le X2 ; I X1 X2, extracted from the sentence pair in Figure 2, the phrase le is not aligned.
sentence pairs is mentioned in 10 sentences in this paper.
van Gompel, Maarten and van den Bosch, Antal
Data preparation
2. for each aligned sentence pair (sentence_s ∈ S_s, sentence_t ∈ S_t) in the parallel corpus split (S_s, S_t):
Data preparation
The output of the algorithm in Figure 1 is a modified set of sentence pairs (sentence_s′, sentence_t), in which the same sentence pair may be used multiple times with different L1 substitutions for different fragments.
Evaluation
If output o is a subset of reference r then a score of % is assigned for that sentence pair.
Evaluation
The word accuracy for the entire set is then computed by taking the sum of the word accuracies per sentence pair, divided by the total number of sentence pairs.
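The set-level computation described above is a plain mean over per-pair scores. A minimal sketch follows; note that `pair_word_accuracy` is a hypothetical stand-in, assuming the per-pair score is the overlap fraction |o ∩ r| / |r| (the excerpt does not fully specify the fraction used).

```python
def pair_word_accuracy(output, reference):
    """Hypothetical per-pair score: fraction of reference words covered by
    the output (equals |o|/|r| when the output is a subset of the reference)."""
    o, r = set(output.split()), set(reference.split())
    return len(o & r) / len(r)

def set_word_accuracy(pairs):
    """Set-level word accuracy: sum of per-pair accuracies divided by the
    total number of sentence pairs, as described in the excerpt."""
    return sum(pair_word_accuracy(o, r) for o, r in pairs) / len(pairs)

# Two hypothetical (output, reference) pairs.
pairs = [("the cat", "the cat sat"), ("a dog", "a dog")]
print(set_word_accuracy(pairs))  # mean of 2/3 and 1.0
```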
Experiments & Results
The data for our experiments were drawn from the Europarl parallel corpus (Koehn, 2005) from which we extracted two sets of 200,000 sentence pairs each for several language pairs.
Experiments & Results
The final test sets are a randomly sampled 5,000 sentence pairs from the 200,000-sentence test split for each language pair.
Experiments & Results
English fallback in a Spanish context, consists of 5,608,015 sentence pairs.
sentence pairs is mentioned in 9 sentences in this paper.
Liu, Shujie and Li, Chi-Ho and Li, Mu and Zhou, Ming
Experiments and Results
The training data contains 81k sentence pairs, 655k Chinese words and 806k English words.
Experiments and Results
The training data contains 354k sentence pairs, 8M Chinese words and 10M English words.
Experiments and Results
re-ranking methods are performed in the same way as for IWSLT data, but for consensus-based decoding, the data set contains too many sentence pairs to be held in one graph for our machine.
Features and Training
For the nodes representing the training sentence pairs, this posterior is fixed.
Graph Construction
If there are sentence pairs with the same source sentence but different translations, all the translations will be assigned as labels to that source sentence, and the corresponding probabilities are estimated by MLE.
Graph Construction
There is no edge between training nodes, since we suppose all the sentences of the training data are correct, and it is pointless to re-estimate the confidence of those sentence pairs.
Graph Construction
Forced alignment performs phrase segmentation and alignment of each sentence pair of the training data using the full translation system as in decoding (Wuebker et al., 2010).
sentence pairs is mentioned in 9 sentences in this paper.
Kim, Sungchul and Toutanova, Kristina and Yu, Hwanjo
Data and task
A total of 13,410 English-Bulgarian and 8,832 English-Korean sentence pairs were extracted.
Data and task
Of these, we manually annotated 91 English-Bulgarian and 79 English-Korean sentence pairs with source and target named entities as well as word-alignment links among named entities in the two languages.
Data and task
Figure 1 illustrates a Bulgarian-English sentence pair with alignment.
Introduction
Our results show that the semi-CRF model improves on the performance of projection models by more than 10 points in F-measure, and that we can achieve a tagging F-measure of over 91 using a very small number of annotated sentence pairs.
sentence pairs is mentioned in 9 sentences in this paper.
Chen, Wenliang and Kazama, Jun'ichi and Torisawa, Kentaro
Bilingual subtree constraints
To solve the mapping problems, we use a bilingual corpus, which includes sentence pairs , to automatically generate the mapping rules.
Bilingual subtree constraints
First, the sentence pairs are parsed by monolingual parsers on both sides.
Bilingual subtree constraints
Figure 8 shows an example of a processed sentence pair that has tree structures on both sides and word alignment links.
Experiments
Note that some sentence pairs were removed because they are not one-to-one aligned at the sentence level (Burkett and Klein, 2008; Huang et al., 2009).
Experiments
Word alignments were generated from the Berkeley Aligner (Liang et al., 2006; DeNero and Klein, 2007) trained on a bilingual corpus having approximately 0.8M sentence pairs .
Motivation
Suppose that we have an input sentence pair as shown in Figure 1, where the source sentence is in English, the target is in Chinese, the dashed undirected links are word alignment links, and the directed links between words indicate that they have a (candidate) dependency relation.
sentence pairs is mentioned in 8 sentences in this paper.
Liu, Zhanyi and Wang, Haifeng and Wu, Hua and Li, Sheng
Collocation Model
The monolingual corpus is first replicated to generate a parallel corpus, where each sentence pair consists of two identical sentences in the same language.
Experiments on Word Alignment
To investigate the quality of the generated word alignments, we randomly selected a subset from the bilingual corpus as a test set, including 500 sentence pairs.
Experiments on Word Alignment
(11), we also manually labeled a development set including 100 sentence pairs, in the same manner as the test set.
Improving Statistical Bilingual Word Alignment
According to the BWA method, given a bilingual sentence pair E = e_1^l and F = f_1^m, the optimal
Improving Statistical Bilingual Word Alignment
Thus, the collocation probability of the alignment sequence of a sentence pair can be calculated according to Eq.
Improving Statistical Bilingual Word Alignment
model to calculate the word alignment probability of a sentence pair, as shown in Eq.
sentence pairs is mentioned in 8 sentences in this paper.
Liu, Shujie and Li, Chi-Ho and Zhou, Ming
Basics of ITG Parsing
Larger and larger span pairs are recursively built until the sentence pair is built.
Basics of ITG Parsing
Figure 1(a) shows one possible derivation for a toy example sentence pair with three words in each sentence.
Evaluation
The 491 sentence pairs in this dataset are adapted to our own Chinese word segmentation standard.
Evaluation
250 sentence pairs are used as training data and the other 241 are test data.
The DITG Models
The MERT module for DITG takes alignment F-score of a sentence pair as the performance measure.
The DITG Models
Given an input sentence pair and the reference annotated alignment, MERT aims to maximize the F-score of DITG-produced alignment.
The DPDI Framework
Discriminative approaches to word alignment use manually annotated alignment for sentence pairs .
The DPDI Framework
Discriminative pruning, however, handles not only a sentence pair but every possible span pair.
sentence pairs is mentioned in 8 sentences in this paper.
DeNero, John and Macherey, Klaus
Introduction
Word alignment is the task of identifying corresponding words in sentence pairs .
Model Definition
Our bidirectional model G = (V, D) is a globally normalized, undirected graphical model of the word alignment for a fixed sentence pair (e, f). Each vertex in the vertex set V corresponds to a model variable V_i, and each undirected edge in the edge set D corresponds to a pair of variables (V_i, V_j). Each vertex has an associated potential function ω_i that assigns a real-valued potential to each possible value v_i of V_i.1 Likewise, each edge has an associated potential function μ_ij(v_i, v_j) that scores pairs of values.
Model Definition
The highest probability word alignment vector under the model for a given sentence pair (e, f) can be computed exactly using the standard Viterbi algorithm for HMMs in O(|e|² · |f|) time.
Model Definition
Figure 1: The structure of our graphical model for a simple sentence pair.
Model Inference
Moreover, the value of u is specific to a sentence pair.
Model Inference
Memory requirements are virtually identical to the baseline: only u must be stored for each sentence pair as it is being processed, but it can then be immediately discarded once alignments are inferred.
sentence pairs is mentioned in 8 sentences in this paper.
Quan, Xiaojun and Kit, Chunyu and Song, Yan
Methodology 2.1 The Problem
Its output is then double-checked and corrected by two experts in bilingual studies, resulting in a data set of 1747 1-1 and 70 1-0 or 0-1 sentence pairs.
Methodology 2.1 The Problem
Methodology 2.1 The Problem
The horizontal axis is the similarity of English sentence pairs and the vertical is the similarity of the corresponding pairs in Chinese.
sentence pairs is mentioned in 7 sentences in this paper.
Smith, Jason R. and Saint-Amand, Herve and Plamada, Magdalena and Koehn, Philipp and Callison-Burch, Chris and Lopez, Adam
Abstract
Sentence Filtering: Since we do not perform any boilerplate removal in earlier steps, there are many sentence pairs produced by the pipeline which contain menu items or other bits of text which are not useful to an SMT system.
Abstract
To measure this, we conducted a manual analysis of 200 randomly selected sentence pairs for each of three language pairs.
Abstract
Table 2: Manual evaluation of precision (by sentence pair) on the extracted parallel data for Spanish, French, and German (paired with English).
sentence pairs is mentioned in 7 sentences in this paper.
Huang, Fei and Xu, Jian-Ming and Ittycheriah, Abraham and Roukos, Salim
Adaptive MT Quality Estimation
Our proposed method is as follows: we select a fixed set of sentence pairs (Sq, Rq) to train the QE model.
Discussion and Conclusion
Another option is to select the sentence pairs from the MT system's subsampled training data, which is more similar to the input document, and thus the trained QE model could be a better match to the input document.
Document-specific MT System
Our parallel corpora include tens of millions of sentence pairs covering a wide range of topics.
Document-specific MT System
The document-specific system is built based on sub-sampling: from the parallel corpora we select sentence pairs that are the most similar to the sentences from the input document, then build the MT system with the sub-sampled sentence pairs.
Document-specific MT System
From the extracted sentence pairs , we utilize the standard pipeline in SMT system building: word align-
Introduction
First, existing approaches to MT quality estimation rely on lexical and syntactical features defined over parallel sentence pairs , which includes source sentences, MT outputs and references, and translation models (Blatz et al., 2004; Ueffing and Ney, 2007; Specia et al., 2009a; Xiong et al., 2010; Soricut and Echihabi, 2010a; Bach et al., 2011).
Static MT Quality Estimation
The high FM phrases are selected from sentence pairs which are closest in terms of n-gram overlap to the input sentence.
sentence pairs is mentioned in 7 sentences in this paper.
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Comparative Study
The interpretation is that given the sentence pair (f_1^7, e_1^7) and its alignment, the correct translation order is e1–f2, e2–f3, e3–f1, e4–f4, e5–f4, e6–f6–f7, e7–f5. Notice the bilingual units have been ordered according to the target side, as the decoder writes the translation in a left-to-right way.
Comparative Study
After the operation in Figure 4 was done for all bilingual sentence pairs, we get a decoding sequence corpus.
Experiments
Firstly, we delete the sentence pairs if the source sentence length is one.
Experiments
Secondly, we delete the sentence pairs if the source sentence contains more than three contiguous unaligned words.
Experiments
When this happens, the sentence pair is usually low quality hence not suitable for learning.
Tagging-style Reordering Model
The transformation in Figure 1 is conducted for all the sentence pairs in the bilingual training corpus.
Tagging-style Reordering Model
During the search, a sentence pair (f_1^J, e_1^I) will be formally split into a segmentation S_1^K which consists of K phrase pairs.
sentence pairs is mentioned in 7 sentences in this paper.
DeNero, John and Chiang, David and Knight, Kevin
Consensus Decoding Algorithms
2.1 Minimum Bayes Risk over Sentence Pairs
Consensus Decoding Algorithms
Algorithm 1 MBR over Sentence Pairs
1: Â ← −∞
2: for e ∈ E do
3:   A_e ← 0
4:   for e′ ∈ E do
5:     A_e ← A_e + P(e′|f) · S(e; e′)
6: …
7: …
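The MBR computation over sentence pairs can be sketched directly in Python. This is an illustrative version under assumed inputs (a candidate list, its posterior P(e′|f), and a similarity function S); the unigram-overlap similarity below is a hypothetical choice, not the paper's.

```python
def mbr_decode(candidates, posterior, similarity):
    """Pick the candidate maximizing expected similarity (minimum Bayes risk)
    under the model posterior: argmax_e sum_{e'} P(e'|f) * S(e; e')."""
    best, best_score = None, float("-inf")
    for e in candidates:
        score = sum(posterior[e2] * similarity(e, e2) for e2 in candidates)
        if score > best_score:
            best, best_score = e, score
    return best

# Toy example with hypothetical posteriors and unigram-overlap similarity.
cands = ["a b c", "a b d", "x y z"]
post = {"a b c": 0.4, "a b d": 0.35, "x y z": 0.25}
sim = lambda e1, e2: len(set(e1.split()) & set(e2.split()))
print(mbr_decode(cands, post, sim))  # "a b c"
```

The double loop over the candidate set mirrors the O(|E|²) structure of Algorithm 1.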
Consensus Decoding Algorithms
MBR over Sentence Pairs / MBR over Features
Experimental Results
A phrase discovery procedure over word-aligned sentence pairs provides rule frequency counts, which are normalized to estimate features on rules.
Experimental Results
The synchronous grammar rules are extracted from word aligned sentence pairs where the target sentence is annotated with a syntactic parse (Galley et al., 2004).
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Paperno, Denis and Pham, Nghia The and Baroni, Marco
Evaluation
(2013), features 200 sentence pairs that were rated for similarity by 43 annotators.
Evaluation
We also consider a similar data set introduced by Grefenstette (2013), comprising 200 sentence pairs rated by 50 annotators.
Evaluation
Evaluation is carried out by computing the Spearman correlation between the annotator similarity ratings for the sentence pairs and the cosines of the vectors produced by the various systems for the same sentence pairs .
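The evaluation protocol described here, Spearman correlation between annotator ratings and system cosines over the same sentence pairs, can be sketched with the standard library alone; the rank-averaging helper is ours:

```python
def ranks(values):
    """Average 1-based ranks, with ties sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)
```

In the setting above, `x` would hold the human similarity ratings and `y` the cosines the system assigns to the same pairs.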
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Wang, Kun and Zong, Chengqing and Su, Keh-Yih
Experiments
We randomly selected a development set and a test set, and then the remaining sentence pairs are for training set.
Experiments
The remaining 28.3% of the sentence pairs are thus not adopted for generating training samples.
Introduction
They first determine whether the extracted TM sentence pair should be adopted or not.
Problem Formulation
is the final translation; [tm_s, tm_t, tm_f, s_a, tm_a] are the associated information of the best TM sentence-pair; tm_s and tm_t denote the corresponding TM sentence pair ; tm_f denotes its associated fuzzy match score (from 0.0 to 1.0); s_a is the editing operations between tm_s and s; and tm_a denotes the word alignment between tm_s and tm_t.
Problem Formulation
Formula (3) is just the typical phrase-based SMT model, and the second factor P(M_k|L_k, x) (to be specified in Section 3) is the information derived from the TM sentence pair .
Problem Formulation
useful information from the best TM sentence pair to guide SMT decoding.
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Schoenemann, Thomas
Conclusion
Table 2: Evaluation of phrase-based translation from German to English with the obtained alignments (for 100.000 sentence pairs ).
Introduction
The downside of our method is its resource consumption, but still we present results on corpora with 100.000 sentence pairs .
See e. g. the author’s course notes (in German), currently
However, since we approximate expectations from the move and swap matrices, and hence by O((1 + J) · J) alignments per sentence pair , in the end we get a polynomial number of terms.
See e. g. the author’s course notes (in German), currently
We use MOSES with a 5-gram language model (trained on 500.000 sentence pairs ) and the standard setup in the MOSES Experiment Management System: training is run in both directions, the alignments are combined using grow-diag-final-and (Och and Ney, 2003) and the parameters of MOSES are optimized on 750 development sentences.
Training the New Variants
sentence pairs s = 1, …
Training the New Variants
This task is also needed for the actual task of word alignment (annotating a given sentence pair with an alignment).
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
He, Wei and Wu, Hua and Wang, Haifeng and Liu, Ting
Experiments
After tokenization and filtering, this bilingual corpus contained 319,694 sentence pairs (7.9M tokens on
Extraction of Paraphrase Rules
3.2 Selecting Paraphrase Sentence Pairs
Extraction of Paraphrase Rules
If the sentence in T 2 has a higher BLEU score than the aligned sentence in T1, the corresponding sentences in S0 and S1 are selected as candidate paraphrase sentence pairs , which are used in the following steps of paraphrase extractions.
Extraction of Paraphrase Rules
From the word-aligned sentence pairs , we then extract a set of rules that are consistent with the word alignments.
Forward-Translation vs. Back-Translation
The aligned sentence pairs in (S0, S1) can be considered as paraphrases.
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Feng, Yang and Cohn, Trevor
Experiments
Here the training data consists of the non-UN portions and non-HK Hansards portions of the NIST training corpora distributed by the LDC, totalling 303k sentence pairs with 8m and 9.4m words of Chinese and English, respectively.
Experiments
Overall there are 276k sentence pairs and 8.21m and 8.97m words in Arabic and English, respectively.
Gibbs Sampling
Specifically we seek to infer the latent sequence of translation decisions given a corpus of sentence pairs .
Gibbs Sampling
It visits each sentence pair in the corpus in a random order and resamples the alignments for each target position as follows.
Model
Therefore, we introduce fertility to denote the number of target positions a source word is linked to in a sentence pair .
Model
where φ_j is the fertility of source word f_j in the sentence pair ⟨f_1^J, e_1^I⟩ and p_b is the basic model defined in Eq.
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Zhu, Conghui and Watanabe, Taro and Sumita, Eiichiro and Zhao, Tiejun
Experiment
Table 1: The sentence pairs used in each data set.
Experiment
The parameter ’samps’ is set to 5, which indicates 5 samples are generated for a sentence pair .
Experiment
However, if most domains are similar (FBIS data set) or if there are enough parallel sentence pairs (NIST data set) in each domain, then the translation performances are almost similar even with the opposite integrating orders.
Introduction
Since SMT systems tend to employ very large scale training data for translation knowledge extraction, updating several sentence pairs each time will be annihilated in the existing corpus.
Phrase Pair Extraction with Unsupervised Phrasal ITGs
ITG is a synchronous grammar formalism which analyzes bilingual text by introducing inverted rules, and each ITG derivation corresponds to the alignment of a sentence pair (Wu, 1997).
Phrase Pair Extraction with Unsupervised Phrasal ITGs
Figure 1(b) illustrates an example of the phrasal ITG derivation for word alignment in Figure 1(a) in which a bilingual sentence pair is recursively divided into two through the recursively defined generative story.
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.
Bilingual NER by Agreement
The inputs to our models are parallel sentence pairs (see Figure 1 for an example in English and
Bilingual NER by Agreement
Since we assume no bilingually annotated NER corpus is available, in order to get an estimate of the PMI scores, we first tag a collection of unannotated bilingual sentence pairs using the monolingual CRF taggers, and collect counts of aligned entity pairs from this auto- generated tagged data.
Error Analysis and Discussion
In this example, a snippet of a longer sentence pair is shown with NER and word alignment results.
Experimental Setup
After discarding sentences with no aligned counterpart, a total of 402 documents and 8,249 parallel sentence pairs were used for evaluation.
Experimental Setup
Word alignment evaluation is done over the sections of OntoNotes that have matching gold-standard word alignment annotations from GALE Y1Q4 dataset.2 This subset contains 288 documents and 3,391 sentence pairs .
Experimental Setup
An extra set of 5,000 unannotated parallel sentence pairs are used for
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Thint, Marcus and Huang, Zhiheng
Conclusions and Discussions
In this paper, we applied a graph-based SSL algorithm to improve the performance of QA task by exploiting unlabeled entailment relations between affirmed question and candidate sentence pairs .
Experiments
We also used a set of 340 QA-type sentence pairs from RTE02-03 and 195 pairs from RTE04 by converting the hypothesis sentences into question form to create additional set of q/a pairs.
Experiments
Each of these headline-candidate sentence pairs is used as additional unlabeled q/a pair.
Experiments
This is due to the fact that we establish two separate entailment models for copula and non-copula q/a sentence pairs that enables extracting useful information and better representation of the specific data.
Feature Extraction for Entailment
(2) (QComp) Question component match features: The sentence component analysis is applied on both the affirmed question and the candidate sentence pairs to characterize their semantic components including subject(S), object(O), head (H) and modifiers(M).
Graph Based Semi-Supervised Learning for Entailment Ranking
Let each data point in X = {x_1, ..., x_n} represent information about a question and candidate sentence pair and Y = {y_1, ..., y_n} be their output labels.
sentence pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Hewavitharana, Sanjika and Mehay, Dennis and Ananthakrishnan, Sankaranarayanan and Natarajan, Prem
Corpus Data and Baseline SMT
The SMT parallel training corpus contains approximately 773K sentence pairs (7.3M English words).
Corpus Data and Baseline SMT
Our phrase-based decoder is similar to Moses (Koehn et al., 2007) and uses the phrase pairs and target LM to perform beam search stack decoding based on a standard log-linear model, the parameters of which were tuned with MERT (Och, 2003) on a held-out development set (3,534 sentence pairs , 45K words) using BLEU as the tuning metric.
Corpus Data and Baseline SMT
Finally, we evaluated translation performance on a separate, unseen test set (3,138 sentence pairs , 38K words).
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Li, Haibo and Zheng, Jing and Ji, Heng and Li, Qi and Wang, Wen
Experiments
The training corpus includes 1,686,458 sentence pairs .
Experiments
For example, in the following sentence pair: "[Chinese text] (in accordance with the tripartite agreement reached by China, Laos and the UNHCR on ...)", even though the tagger can successfully label "[Chinese text]/UNHCR" as an organization because it is a common Chinese name, English features based on previous GPE contexts still incorrectly predicted "UNHCR" as a GPE name.
Name-aware MT
Given a parallel sentence pair we first apply Giza++ (Och and Ney, 2003) to align words, and apply this join-
Name-aware MT
For example, given the following sentence pair:
Name-aware MT
Both sentence pairs are kept in the combined data to build the translation model.
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Zollmann, Andreas and Vogel, Stephan
Experiments
The parallel training data comprises 9.6M sentence pairs (206M Chinese and 228M English words).
Hard rule labeling from word classes
(2003) to provide us with a set of phrase pairs for each sentence pair in the training corpus, annotated with their respective start and end positions in the source and target sentences.
Hard rule labeling from word classes
Consider the target-tagged example sentence pair:
Hard rule labeling from word classes
Intuitively, the labeling of initial rules with tags marking the boundary of their target sides results in complex rules whose nonterminal occurrences impose weak syntactic constraints on the rules eligible for substitution in a PSCFG derivation: The left and right boundary word tags of the inserted rule’s target side have to match the respective boundary word tags of the phrase pair that was replaced by a nonterminal when the complex rule was created from a training sentence pair .
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Experiments
To train our Two-Neighbor Orientation model, we select a subset of 5 million aligned sentence pairs .
Training
For each aligned sentence pair (F, E, N) in the training data, the training starts with the identification of the regions in the source sentences as anchors (A).
Two-Neighbor Orientation Model
Given an aligned sentence pair Θ = (F, E, N), let A(Θ) be all possible chunks that can be extracted from Θ according to:
Two-Neighbor Orientation Model
Figure 1: An aligned Chinese-English sentence pair .
Two-Neighbor Orientation Model
To be more concrete, let us consider an aligned sentence pair in Fig.
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Guo, Weiwei and Diab, Mona
Evaluation for SS
The two data sets we know of for SS are: 1. human-rated sentence pair similarity data set (Li et al., 2006) [LI06]; 2. the Microsoft Research Paraphrase Corpus (Dolan et al., 2004) [MSR04].
Evaluation for SS
On the other hand, the MSR04 data set comprises a much larger set of sentence pairs : 4,076 training and 1,725 test pairs.
Evaluation for SS
This is not a problem per se, however the issue is that it is very strict in its assignment of a positive label, for example the following sentence pair as cited in (Islam and Inkpen, 2008) is rated not semantically similar: Ballmer has been vocal in the past warning that Linux is a threat to Microsoft.
Experiments and Results
Note that r and ρ are much lower for the 35-pairs set, since most of the sentence pairs have a very low similarity (the average similarity value is 0.065 in the 35-pairs set and 0.367 in the 30-pairs set) and SS models need to identify the tiny difference among them, thereby rendering this set much harder to predict.
Experiments and Results
We use the same parameter setting used for the LI06 evaluation setting since both sets are human-rated sentence pairs (λ = 20, w_m = 0.01, K = 100).
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ganchev, Kuzman and Graça, João V. and Taskar, Ben
Adding agreement constraints
For each sentence pair instance x = (s, t), we find the posterior
Introduction
The typical pipeline for a machine translation (MT) system starts with a parallel sentence-aligned corpus and proceeds to align the words in every sentence pair .
Statistical word alignment
Figure 2 shows two examples of word alignment of a sentence pair .
Word alignment results
Figure 2 shows two machine-generated alignments of a sentence pair .
Word alignment results
The figure illustrates a problem known as garbage collection (Brown et al., 1993), where rare source words tend to align to many target words, since the probability mass of the rare word translations can be hijacked to fit the sentence pair .
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Toutanova, Kristina and Suzuki, Hisami and Ruopp, Achim
Inflection prediction models
Figure 1 shows an example of an aligned English-Russian sentence pair : on the source (English) side, POS tags and word dependency structure are indicated by solid arcs.
Inflection prediction models
Figure 1: Aligned English—Russian sentence pair with syntactic and morphological annotation.
MT performance results
The judges were given the reference translations but not the source sentences, and were asked to classify each sentence pair into three categories: (1) the baseline system is better (score=-1), (2) the output of our model is better (score=1), or (3) they are of the same quality (score=0).
Machine translation systems and data
These aligned sentence pairs form the training data of the inflection models as well.
Machine translation systems and data
[Table: dataset sizes (sentence pairs; word tokens, avg/sent) for English-Russian]
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Sun, Jun and Zhang, Min and Tan, Chew Lim
Substructure Spaces for BTKs
[Table: Chinese/English corpus statistics; # of sentence pairs: 5000; Avg. …]
Substructure Spaces for BTKs
We randomly select 300 bilingual sentence pairs from the Chinese-English FBIS corpus with the length S 30 in both the source and target sides.
Substructure Spaces for BTKs
The selected plain sentence pairs are further parsed by Stanford parser (Klein and Manning, 2003) on both the English and Chinese sides.
sentence pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Xiang, Bing and Luo, Xiaoqiang and Zhou, Bowen
Experimental Results
The MT training data includes 2 million sentence pairs from the parallel corpora released by
Experimental Results
We append a 300-sentence set, which we have human hand alignment available as reference, to the 2M training sentence pairs before running GIZA++.
Integrating Empty Categories in Machine Translation
Table 3 listed some of the most frequent English words aligned to *pro* or *PRO* in a Chinese-English parallel corpus with 2M sentence pairs .
Introduction
A sentence pair observed in the real data is shown in Figure 1 along with the word alignment obtained from an automatic word aligner, where the English subject pronoun
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Cohn, Trevor and Haffari, Gholamreza
Model
This way we don’t insist on a single tiling of phrases for a sentence pair , but explicitly model the set of hierarchically nested phrases as defined by an ITG derivation.
Model
S → sig(t)   sig(t) → yield(t)   (for every word pair e/f in the sentence pair )
Model
This process is then repeated for each sentence pair in the corpus in a random order.
Related Work
additional constraints on how phrase-pairs can be tiled to produce a sentence pair , and moreover, we seek to model the embedding of phrase-pairs in one another, something not considered by this prior work.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Deng, Yonggang and Xu, Jia and Gao, Yuqing
A Generic Phrase Training Procedure
1: Train Model-1 and HMM word alignment models
2: for all sentence pairs (e, f) do
A Generic Phrase Training Procedure
Phrase pair filtering is simply thresholding on the final score by comparing to the maximum within the sentence pair .
Features
Given a phrase pair in a sentence pair , there will be many generative paths that align the source phrase to the target phrase.
Features
Given a sentence pair , the basic assumption is that if the HMM word alignment model can align an English phrase well to a foreign phrase, the posterior distribution of the English phrase generating all foreign phrases on the other side is significantly biased.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Dongdong and Li, Mu and Duan, Nan and Li, Chi-Ho and Zhou, Ming
Experiments
We extracted both development and test data set from years of NIST Chinese-to-English evaluation data by filtering out sentence pairs not containing measure words.
Experiments
The development set is extracted from NIST evaluation data from 2002 to 2004, and the test set consists of sentence pairs from NIST evaluation data from 2005 to 2006.
Experiments
There are 759 testing cases for measure word generation in our test data consisting of 2746 sentence pairs .
Model Training and Application 3.1 Training
We ran GIZA++ (Och and Ney, 2000) on the training corpus in both directions with IBM model 4, and then applied the refinement rule described in (Koehn et al., 2003) to obtain a many-to-many word alignment for each sentence pair .
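The symmetrization step described here, combining the two directional GIZA++ alignments into a single many-to-many alignment, can be illustrated with a simplified refinement heuristic; this is a reduced sketch of the grow-diag idea behind the rule of Koehn et al. (2003), not their exact procedure:

```python
def symmetrize(src_to_tgt, tgt_to_src):
    """Start from the intersection of the two directional alignments,
    then repeatedly adopt union links adjacent to a confirmed link."""
    a1 = set(src_to_tgt)                       # (src, tgt) pairs
    a2 = {(s, t) for t, s in tgt_to_src}       # flip (tgt, src) pairs
    alignment = a1 & a2
    union = a1 | a2
    grown = True
    while grown:
        grown = False
        for (s, t) in sorted(union - alignment):
            # adopt a union link if it neighbours a confirmed link
            if any((s + ds, t + dt) in alignment
                   for ds in (-1, 0, 1) for dt in (-1, 0, 1)):
                alignment.add((s, t))
                grown = True
    return alignment
```

The intersection is high-precision; growing toward the union recovers the many-to-many links that a single direction cannot express.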
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhao, Shiqi and Wang, Haifeng and Liu, Ting and Li, Sheng
Abstract
Using the presented method, we extract over 1,000,000 pairs of paraphrase patterns from 2M bilingual sentence pairs , the precision of which exceeds 67%.
Conclusion
Experimental results show that the pivot approach is effective, which extracts over 1,000,000 pairs of paraphrase patterns from 2M bilingual sentence pairs .
Introduction
Using the proposed approach, we extract over 1,000,000 pairs of paraphrase patterns from 2M bilingual sentence pairs , the precision of which is above 67%.
Related Work
Thus the method acquired paraphrase patterns from sentence pairs that share comparable NEs.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ganchev, Kuzman and Gillenwater, Jennifer and Taskar, Ben
Approach
Figure 1(b) shows an aligned sentence pair example where dependencies are perfectly conserved across the alignment.
Approach
For example, in some sentence pair we might find 10 edges that have both end points aligned and can be transferred.
Experiments
Our basic model uses constraints of the form: the expected proportion of conserved edges in a sentence pair is at least η = 90%.
Posterior Regularization
In grammar transfer, our basic constraint is of the form: the expected proportion of conserved edges in a sentence pair is at least η (the exact proportion we used was 0.9, which was determined using unlabeled data as described in Section 5).
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Pilehvar, Mohammad Taher and Jurgens, David and Navigli, Roberto
A Unified Semantic Representation
Commonly, semantic comparisons are between word pairs or sentence pairs that do not have their lexical content sense-annotated, despite the potential utility of sense annotation in making semantic comparisons.
Experiment 1: Textual Similarity
As our benchmark, we selected the recent SemEval-2012 task on Semantic Textual Similarity (STS), which was concerned with measuring the semantic similarity of sentence pairs .
Experiment 1: Textual Similarity
Each sentence pair in the datasets was given a score from 0 to 5 (low to high similarity) by human judges, with a high inter-annotator agreement of around 0.90 when measured using the Pearson correlation coefficient.
Experiment 1: Textual Similarity
Table 1 lists the number of sentence pairs in training and test portions of each dataset.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Wu, Hua and Wang, Haifeng
Conclusion
We used about 70k sentence pairs for CE model training, while Wang et al.
Conclusion
(2008) used about 100k sentence pairs , a CE translation dictionary and more monolingual corpora for model training.
Experiments
For English-Spanish translation, we selected 400k sentence pairs from the Europarl corpus that are close to the English parts of both the BTEC CE corpus and the BTEC ES corpus.
Using RBMT Systems for Pivot Translation
Another way to use the synthetic multilingual corpus is to add the source-pivot or pivot-target sentence pairs in this corpus to the training data to rebuild the source-pivot or pivot-target SMT model.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Zong, Chengqing
Experiments
They are neither parallel nor comparable because we cannot even extract a small number of parallel sentence pairs from this monolingual data using the method of (Munteanu and Marcu, 2006).
Experiments
For the out-of-domain data, we build the phrase table and reordering table using the 2.08 million Chinese-to-English sentence pairs , and we use the SRILM toolkit (Stolcke, 2002) to train the 5-gram English language model with the target part of the parallel sentences and the Xinhua portion of the English Gigaword.
Phrase Pair Refinement and Parameterization
For each entry in LLR-lex, such as ([34], of), we can learn two kinds of information from the out-of-domain word-aligned sentence pairs : one is whether the target translation is before or after the translation of the preceding source-side word (Order); the other is whether the target translation is adjacent with the translation of the preceding source-side word (Adjacency).
Related Work
For the target-side monolingual data, they just use it to train the language model, and for the source-side monolingual data, they employ a baseline (word-based SMT or phrase-based SMT trained with small-scale bitext) to first translate the source sentences, combining the source sentence and its target translation as a bilingual sentence pair, and then train a new phrase-based SMT with these pseudo sentence pairs .
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Yamangil, Elif and Shieber, Stuart M.
Evaluation
This corpus consists of 1370 sentence pairs that were manually created from transcribed Broadcast News stories.
Evaluation
a pattern of compression used many times in the BNC in sentence pairs such as “NPR’s Anne Gar-rels reports” / “Anne Garrels reports”.
Sentence compression
An example sentence pair , which we use as a running example, is the following:
The STSG Model
See Figure 1 for an example of how an STSG with these rules would operate in synchronously generating our example sentence pair .
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Mylonakis, Markos and Sima'an, Khalil
Experiments
For all language pairs we employ 200K and 400K sentence pairs for training, 2K for development and 2K for testing (single reference per source sentence).
Experiments
Table 1 presents the results for the baseline and our method for the 4 language pairs, for training sets of both 200K and 400K sentence pairs .
Experiments
In addition, increasing the size of the training data from 200K to 400K sentence pairs widens the performance margin between the baseline and our system, in some cases considerably.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Experiments and Results
The training data contains 81k sentence pairs , 655K Chinese words and 806K English words.
Model Training
The training samples for RAE are phrase pairs {s1, s2} in the translation table, where s1 and s2 can form a continuous partial sentence pair in the training data.
Model Training
Forced decoding performs sentence pair segmentation using the same translation system as decoding.
Model Training
For each sentence pair in the training data, SMT decoder is applied to the source side, and any candidate which is not the partial substring of the target sentence is removed from the n-best list during decoding.
sentence pairs is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Narayan, Shashi and Gardent, Claire
Experiments
PWKP contains 108,016/114,924 complex/simple sentence pairs .
Simplification Framework
(2010); and build training graphs (Figure 2) from the pair of complex and simple sentence pairs in the training data.
Simplification Framework
Each training graph represents a complex-simple sentence pair and consists of two types of nodes: major nodes (M-nodes) and operation nodes (O-nodes).
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Cai, Jingsheng and Utiyama, Masao and Sumita, Eiichiro and Zhang, Yujie
Conclusion
Moreover, our dependency-based pre-ordering rule set substantially decreased the time for applying pre-ordering rules by about 60% compared with WR07, on the training set of 1M sentence pairs .
Experiments
Our development set was the official NIST MT evaluation data from 2002 to 2005, consisting of 4476 Chinese-English sentence pairs .
Experiments
Our test set was the NIST 2006 MT evaluation data, consisting of 1664 sentence pairs .
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Xiaolin and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro
Complexity Analysis
The computational complexity of our method is linear in the number of iterations, the size of the corpus, and the complexity of calculating the expectations on each sentence or sentence pair .
Complexity Analysis
This was verified by experiments on a corpus of 1-million sentence pairs on which traditional MCMC approaches would struggle (Xu et al., 2008).
Methods
(F, E): a bilingual sentence pair
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Visweswariah, Karthik and Khapra, Mitesh M. and Ramanathan, Ananthakrishnan
Introduction
Specifically, we show that we can significantly improve reordering performance by using a large number of sentence pairs for which manual word alignments are not available.
Reordering model
In this paper we focus on the case where in addition to using a relatively small number of manual word aligned sentences to derive the reference permutations 77* used to train our model, we would like to use more abundant but noisier machine aligned sentence pairs .
Results and Discussions
We use H to refer to the manually word aligned data and U to refer to the additional sentence pairs for which manual word alignments are not available.
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
liu, lemao and Watanabe, Taro and Sumita, Eiichiro and Zhao, Tiejun
Introduction
For the Chinese-to-English task, the training data is the FBIS corpus (news domain) with about 240k sentence pairs ; the development set is the NIST02 evaluation data; the development test set is NIST05; and the test datasets are NIST06 and NIST08.
Introduction
For the Japanese-to-English task, the training data with 300k sentence pairs is from the NTCIR-patent task (Fujii et al., 2010); the development set, development test set, and two test sets are evenly extracted from a given development set with 4000 sentences, and these four datasets are called test1, test2, test3 and test4, respectively.
Introduction
We run GIZA++ (Och and Ney, 2000) on the training corpus in both directions (Koehn et al., 2003) to obtain the word alignment for each sentence pair .
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kauchak, David
Introduction
Table 1 shows the n-gram overlap proportions in a sentence aligned data set of 137K sentence pairs from aligning Simple English Wikipedia and English Wikipedia articles (Coster and Kauchak, 2011a).1 The data highlights two conflicting views: does the benefit of additional data outweigh the problem of the source of the data?
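The n-gram overlap proportions discussed here can be computed with a short stdlib sketch; `overlap_proportion` (our name, not from the paper) reports, for one aligned sentence pair, the fraction of one side's n-grams that also occur on the other side:

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def overlap_proportion(simple, normal, n):
    """Fraction of the simple side's n-grams that also occur
    on the normal side of the aligned sentence pair."""
    grams = ngrams(simple.split(), n)
    if not grams:
        return 0.0
    normal_grams = set(ngrams(normal.split(), n))
    return sum(g in normal_grams for g in grams) / len(grams)
```

Averaging this over an aligned corpus gives per-n overlap figures of the kind the table summarizes; higher n naturally yields lower overlap.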
Introduction
On the other hand, there is still only modest overlap between the sentences for longer n-grams, particularly given that the corpus is sentence-aligned and that 27% of the sentence pairs in this aligned data set are identical.
Why Does Unsimplified Data Help?
The resulting data set contains 150K aligned simple-normal sentence pairs .
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Goto, Isao and Utiyama, Masao and Sumita, Eiichiro and Tamura, Akihiro and Kurohashi, Sadao
Experiment
So approximately 2.05 million sentence pairs consisting of approximately 54 million
Experiment
And approximately 0.49 million sentence pairs consisting of 14.9 million Chinese tokens whose lexicon size was 169k and 16.3 million English tokens whose lexicon size was 240k were used for CE.
Experiment
Our distortion model was trained as follows: We used 0.2 million sentence pairs and their word alignments from the data used to build the translation model as the training data for our distortion models.
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Chen, Boxing and Kuhn, Roland and Foster, George
Experiments
Most training subcorpora consist of parallel sentence pairs .
Introduction
The resulting bilingual sentence pairs are then used as additional training data (Ueffing et al., 2007; Chen et al., 2008; Schwenk, 2008; Bertoldi and Federico, 2009).
Introduction
Data selection approaches (Zhao et al., 2004; Hildebrand et al., 2005; Lu et al., 2007; Moore and Lewis, 2010; Axelrod et al., 2011) search for bilingual sentence pairs that are similar to the in-domain “dev” data, then add them to the training data.
sentence pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Braune, Fabienne and Seemann, Nina and Quernheim, Daniel and Maletti, Andreas
Theoretical Model
In this manner we obtain sentence pairs like the one shown in Figure 3.
Theoretical Model
To these sentence pairs we apply the rule extraction method of Maletti (2011).
Theoretical Model
The rules extracted from the sentence pair of Figure 3 are shown in Figure 4.
sentence pairs is mentioned in 3 sentences in this paper.
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Experiments
We train our model on a dataset with ~1.5M sentence pairs from the LDC dataset.2 We use the 2002 NIST MT evaluation test data (878 sentence pairs) as the development data, and the 2003, 2004, 2005, 2006-news NIST MT evaluation test data (919, 1788, 1082, and 616 sentence pairs, respectively) as the test data.
Head-Driven HPB Translation Model
For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007).
Introduction
Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence.
sentence pairs is mentioned in 3 sentences in this paper.
Ravi, Sujith and Knight, Kevin
Introduction
Starting with the classic IBM work (Brown et al., 1993), training has been viewed as a maximization problem involving hidden word alignments (a) that are assumed to underlie observed sentence pairs
Machine Translation as a Decipherment Task
(1993) provide an efficient algorithm for training the IBM Model 3 translation model when parallel sentence pairs are available.
Machine Translation as a Decipherment Task
We see that deciphering with 10k monolingual Spanish sentences yields the same performance as training with around 200-500 parallel English/Spanish sentence pairs.
sentence pairs is mentioned in 3 sentences in this paper.
Jiang, Wenbin and Liu, Qun
Experiments
It contains 239K sentence pairs with about 6.9M/8.9M words in Chinese/English.
Experiments
The alignment matrices for sentence pairs are generated according to (Liu et al., 2009).
Projected Classification Instance
Suppose a bilingual sentence pair composed of a source sentence e and its target translation f; ye is the parse tree of the source sentence.
sentence pairs is mentioned in 3 sentences in this paper.
Setiawan, Hendra and Kan, Min Yen and Li, Haizhou and Resnik, Philip
Experimental Setup
We trained the system on the NIST MT06 Eval corpus excluding the UN data (approximately 900K sentence pairs).
Experimental Setup
We trained the system on a subset of 950K sentence pairs from the NIST MT08 training data, selected by
Experimental Setup
The subsampling algorithm selects sentence pairs from the training data in a way that seeks reasonable representation for all n-grams appearing in the test set.
sentence pairs is mentioned in 3 sentences in this paper.
Pado, Sebastian and Galley, Michel and Jurafsky, Dan and Manning, Christopher D.
EXpt. 1: Predicting Absolute Scores
Each language consists of 1500-2800 sentence pairs produced by 7-15 MT systems.
EXpt. 1: Predicting Absolute Scores
RTER has a rather flat learning curve that climbs to within 2 points of the final correlation value for 20% of the training set (about 400 sentence pairs).
Textual Entailment vs. MT Evaluation
The average total runtime per sentence pair is 5 seconds on an AMD 2.6GHz Opteron core — efficient enough to perform regular evaluations on development and test sets.
sentence pairs is mentioned in 3 sentences in this paper.
Liu, Yang and Lü, Yajuan and Liu, Qun
Experiments
Table 4: Comparison of rule extraction time (seconds/1000 sentence pairs) and decoding time (second/sentence)
Experiments
Table 4 gives the rule extraction time (seconds/1000 sentence pairs) and decoding time (second/sentence) with varying pruning thresholds.
Experiments
the new training corpus contained about 260K sentence pairs with 7.39M Chinese words and 9.41M English words.
sentence pairs is mentioned in 3 sentences in this paper.
Cherry, Colin
Cohesive Phrasal Output
Previous approaches to measuring the cohesion of a sentence pair have worked with a word alignment (Fox, 2002; Lin and Cherry, 2003).
Experiments
We test our cohesion-enhanced Moses decoder trained using 688K sentence pairs of Europarl French-English data, provided by the SMT 2006 Shared Task (Koehn and Monz, 2006).
Experiments
Comparing the baseline with and without the soft cohesion constraint, we see that cohesion has only a modest effect on BLEU, when measured on all sentence pairs, with improvements ranging between 0.2 and 0.5 absolute points.
sentence pairs is mentioned in 3 sentences in this paper.