Experiments | The y-axis denotes the scores for each metric, and the x-axis denotes the percentage of the highest-scoring sentence pairs that are kept. |
Experiments | However, translation models are generally robust to such errors and can learn good translations even in the presence of imperfect sentence pairs. |
Experiments | Example sentence pairs. |
Parallel Data Extraction | In this process, lexical tables for the EN-ZH language pair used by Model 1 were built in both directions from the FBIS dataset (LDC2003E14), a corpus of 300K sentence pairs from the news domain.
Parallel Data Extraction | Likewise, for the EN-AR language pair, we use a fraction of the NIST dataset, removing the data originating from the UN, which leaves approximately 1M sentence pairs.
Parallel Segment Retrieval | This is obviously not our goal, since we would not obtain any useful sentence pairs.
Parallel Segment Retrieval | It is highest for segmentations that cover all the words in the document (this is desirable since there are many sentence pairs that can be extracted but we want to find the largest sentence pair in the document). |
DNN for word alignment | Given a sentence pair (e, f), HMM word alignment takes the following form: |
DNN for word alignment | To decode our model, the lexical translation scores are computed for each source-target word pair in the sentence pair, which requires going through the neural network |e| × |f| times; after that, the forward-backward algorithm can be used to find the Viterbi path as in the classic HMM model.
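The Viterbi decoding step described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the lexical scores `lex[j, i]` have already been produced (e.g. by one forward pass of the neural network per source-target word pair) and that `trans` holds hypothetical HMM transition scores.

```python
import numpy as np

def viterbi_alignment(lex, trans):
    """Viterbi path for an HMM word aligner.

    lex[j, i]   : lexical score p(f_j | e_i) for target word j, source word i.
    trans[i, i']: transition score p(a_j = i | a_{j-1} = i').
    Returns the most probable alignment a_1..a_J (one source index per
    target word).
    """
    J, I = lex.shape
    delta = np.full((J, I), -np.inf)
    back = np.zeros((J, I), dtype=int)
    delta[0] = np.log(lex[0]) - np.log(I)  # uniform initial distribution
    for j in range(1, J):
        for i in range(I):
            scores = delta[j - 1] + np.log(trans[i])
            back[j, i] = int(np.argmax(scores))
            delta[j, i] = scores[back[j, i]] + np.log(lex[j, i])
    # backtrace from the best final state
    a = [int(np.argmax(delta[-1]))]
    for j in range(J - 1, 0, -1):
        a.append(int(back[j, a[-1]]))
    return a[::-1]
```

The |e| × |f| network passes only fill the `lex` matrix; the dynamic program itself is the same as in the classic HMM model.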
Experiments and Results | We use the manually aligned Chinese-English alignment corpus (Haghighi et al., 2009) which contains 491 sentence pairs as test set. |
Experiments and Results | Our parallel corpus contains about 26 million unique sentence pairs in total, which are mined from the web.
Training | In practice, the number of nonzero parameters in the classic HMM model would be much smaller, as many words do not co-occur in bilingual sentence pairs.
Training | our model from raw sentence pairs, they are too computationally demanding, as the lexical translation probabilities must be computed from neural networks.
Training | Hence, we opt for a simpler supervised approach, which learns the model from sentence pairs with word alignment. |
Experiment Results | The Arabic-English system was trained from 264K sentence pairs with true case English. |
Experiment Results | The Chinese-English system was trained on the FBIS corpus of 384K sentence pairs; the English side is lowercased.
Experiment Results | The systems were trained on 1.8 million sentence pairs using the Europarl corpora. |
Introduction | From this Hiero derivation, we have a segmentation of the sentence pairs into phrase pairs according to the word alignments, as shown on the left side of Figure 1. |
Phrasal-Hiero Model | In the rule X → Je X1 le Français ; I X1 french extracted from the sentence pair in Figure 1, the phrase le Français connects to the phrase french because the French word Français aligns with the English word french even though le is unaligned.
Phrasal-Hiero Model | Figure 2: Alignment of a sentence pair.
Phrasal-Hiero Model | For example, in the rule r4 = X → je X1 le X2 ; i X1 X2 extracted from the sentence pair in Figure 2, the phrase le is not aligned.
Abstract | Sentence Filtering: Since we do not perform any boilerplate removal in earlier steps, there are many sentence pairs produced by the pipeline which contain menu items or other bits of text which are not useful to an SMT system.
Abstract | To measure this, we conducted a manual analysis of 200 randomly selected sentence pairs for each of three language pairs. |
Abstract | Table 2: Manual evaluation of precision (by sentence pair) on the extracted parallel data for Spanish, French, and German (paired with English).
Methodology 2.1 The Problem | Its output is then double-checked and corrected by two experts in bilingual studies, resulting in a data set of 1747 1-1 and 70 1-0 or 0-1 sentence pairs.
Methodology 2.1 The Problem | (Figure: the x-axis shows the similarity of English sentence pairs, ranging from 0.6 to 1.0.)
Methodology 2.1 The Problem | The horizontal axis is the similarity of English sentence pairs and the vertical is the similarity of the corresponding pairs in Chinese. |
Comparative Study | The interpretation is that, given the sentence pair (f_1^J, e_1^I) and its alignment, the correct translation order is e1-f2-f3, e3-f1, e4-f4, e5-f4, e6-f6-f7, e7-f5. Notice the bilingual units have been ordered according to the target side, as the decoder writes the translation in a left-to-right way.
Comparative Study | After the operation in Figure 4 was done for all bilingual sentence pairs, we get a decoding sequence corpus.
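The target-side ordering described above can be sketched as a small helper. The names are illustrative, not the paper's code: given the word alignment of a sentence pair, it groups source positions into bilingual units and emits them in the order the decoder would write the translation.

```python
def decoding_sequence(alignment, I):
    """Group source positions by the target words they align to, then
    emit the bilingual units in target order (left to right), mirroring
    how the decoder writes out the translation.

    alignment : set of (e_pos, f_pos) links
    I         : target sentence length
    Returns a list like [(e_pos, [f_pos, ...]), ...] sorted by e_pos.
    """
    units = {}
    for e, f in alignment:
        units.setdefault(e, []).append(f)
    return [(e, sorted(units[e])) for e in range(I) if e in units]
```

Concatenating these sequences over the whole bitext yields a decoding sequence corpus of the kind described above.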
Experiments | Firstly, we delete the sentence pairs if the source sentence length is one. |
Experiments | Secondly, we delete the sentence pairs if the source sentence contains more than three contiguous unaligned words. |
Experiments | When this happens, the sentence pair is usually of low quality and hence not suitable for learning.
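The two filters above can be sketched as follows. This is a minimal illustration; the threshold of three contiguous unaligned source words is taken directly from the text, and the function name is hypothetical.

```python
def keep_pair(src_tokens, alignment):
    """Apply the two filters described above: drop pairs whose source
    sentence has length one, and pairs with more than three contiguous
    unaligned source words (a sign of low-quality alignment).

    alignment: set of (src_pos, tgt_pos) links.
    """
    if len(src_tokens) == 1:
        return False
    aligned = {s for s, _ in alignment}
    run = 0  # length of the current run of unaligned source words
    for j in range(len(src_tokens)):
        run = run + 1 if j not in aligned else 0
        if run > 3:
            return False
    return True
```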
Tagging-style Reordering Model | The transformation in Figure 1 is conducted for all the sentence pairs in the bilingual training corpus. |
Tagging-style Reordering Model | During the search, a sentence pair (f_1^J, e_1^I) will be formally split into a segmentation s_1^K which consists of K phrase pairs.
Experiment | Table 1: The sentence pairs used in each data set.
Experiment | The parameter 'samps' is set to 5, which indicates that 5 samples are generated for each sentence pair.
Experiment | However, if most domains are similar (FBIS data set) or if there are enough parallel sentence pairs in each domain (NIST data set), then translation performance is almost the same even with the opposite integration orders.
Introduction | Since SMT systems tend to employ very large-scale training data for translation knowledge extraction, adding only a few sentence pairs at a time would be swamped by the existing corpus.
Phrase Pair Extraction with Unsupervised Phrasal ITGs | ITG is a synchronous grammar formalism which analyzes bilingual text by introducing inverted rules, and each ITG derivation corresponds to the alignment of a sentence pair (Wu, 1997). |
Phrase Pair Extraction with Unsupervised Phrasal ITGs | Figure 1(b) illustrates an example of the phrasal ITG derivation for the word alignment in Figure 1(a), in which a bilingual sentence pair is recursively divided into two through the recursively defined generative story.
Conclusion | Table 2: Evaluation of phrase-based translation from German to English with the obtained alignments (for 100,000 sentence pairs).
Introduction | The downside of our method is its resource consumption, but still we present results on corpora with 100,000 sentence pairs.
See e. g. the author’s course notes (in German), currently | However, since we approximate expectations from the move and swap matrices, and hence from O((1 + J) · J) alignments per sentence pair, in the end we get a polynomial number of terms.
See e. g. the author’s course notes (in German), currently | We use MOSES with a 5-gram language model (trained on 500,000 sentence pairs) and the standard setup in the MOSES Experiment Management System: training is run in both directions, the alignments are combined using grow-diag-final-and (Och and Ney, 2003), and the parameters of MOSES are optimized on 750 development sentences.
Training the New Variants | sentence pairs s = 1, …, S.
Training the New Variants | This task is also needed for the actual task of word alignment (annotating a given sentence pair with an alignment). |
Bilingual NER by Agreement | The inputs to our models are parallel sentence pairs (see Figure 1 for an example in English and |
Bilingual NER by Agreement | Since we assume no bilingually annotated NER corpus is available, in order to get an estimate of the PMI scores, we first tag a collection of unannotated bilingual sentence pairs using the monolingual CRF taggers, and collect counts of aligned entity pairs from this auto-generated tagged data.
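Estimating PMI from the collected counts of aligned entity pairs can be sketched as follows; this is a minimal illustration of standard pointwise mutual information over tag pairs, not the paper's code.

```python
import math
from collections import Counter

def pmi_scores(entity_pairs):
    """Pointwise mutual information over aligned entity tag pairs,
    e.g. counted from auto-tagged bilingual sentence pairs.

    entity_pairs: iterable of (src_tag, tgt_tag) tuples.
    Returns {(src_tag, tgt_tag): PMI}.
    """
    joint = Counter(entity_pairs)
    n = sum(joint.values())
    src = Counter(s for s, _ in joint.elements())  # marginal counts
    tgt = Counter(t for _, t in joint.elements())
    return {
        (s, t): math.log((c / n) / ((src[s] / n) * (tgt[t] / n)))
        for (s, t), c in joint.items()
    }
```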
Error Analysis and Discussion | In this example, a snippet of a longer sentence pair is shown with NER and word alignment results. |
Experimental Setup | After discarding sentences with no aligned counterpart, a total of 402 documents and 8,249 parallel sentence pairs were used for evaluation. |
Experimental Setup | Word alignment evaluation is done over the sections of OntoNotes that have matching gold-standard word alignment annotations from the GALE Y1Q4 dataset. This subset contains 288 documents and 3,391 sentence pairs.
Experimental Setup | An extra set of 5,000 unannotated parallel sentence pairs are used for |
Experiments | We randomly selected a development set and a test set; the remaining sentence pairs were used as the training set.
Experiments | The remaining 28.3% of the sentence pairs are thus not adopted for generating training samples. |
Introduction | They first determine whether the extracted TM sentence pair should be adopted or not. |
Problem Formulation | is the final translation; [tm_s, tm_t, tm_f, s_a, tm_a] is the associated information of the best TM sentence pair; tm_s and tm_t denote the corresponding TM sentence pair; tm_f denotes its associated fuzzy match score (from 0.0 to 1.0); s_a is the editing operations between tm_s and s; and tm_a denotes the word alignment between tm_s and tm_t.
Problem Formulation | Formula (3) is just the typical phrase-based SMT model, and the second factor P(M_k|L_k, x) (to be specified in Section 3) is the information derived from the TM sentence pair.
Problem Formulation | useful information from the best TM sentence pair to guide SMT decoding. |
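A fuzzy match score like tm_f above is commonly computed from word-level edit distance. The following is a minimal sketch of one such definition; the paper's exact scorer may differ.

```python
def fuzzy_match(s, tm_s):
    """Word-level fuzzy match score in [0.0, 1.0] between an input
    sentence s and a TM source sentence tm_s (both token lists),
    derived from Levenshtein edit distance.
    """
    m, n = len(s), len(tm_s)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == tm_s[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return 1.0 - d[m][n] / max(m, n, 1)
```

A score of 1.0 means an exact match; lower values mean more edits are needed to turn the TM source into the input.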
Experiments | Here the training data consists of the non-UN portions and non-HK Hansards portions of the NIST training corpora distributed by the LDC, totalling 303k sentence pairs with 8m and 9.4m words of Chinese and English, respectively. |
Experiments | Overall there are 276k sentence pairs and 8.21m and 8.97m words in Arabic and English, respectively. |
Gibbs Sampling | Specifically we seek to infer the latent sequence of translation decisions given a corpus of sentence pairs.
Gibbs Sampling | It visits each sentence pair in the corpus in a random order and resamples the alignments for each target position as follows. |
Model | Therefore, we introduce fertility to denote the number of target positions a source word is linked to in a sentence pair.
Model | where φ_j is the fertility of source word f_j in the sentence pair ⟨f_1^J, e_1^I⟩ and p_b is the basic model defined in Eq.
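Fertility as defined above is just a count over the alignment. A minimal sketch, assuming the alignment is represented as a list giving the source position each target word links to (an illustrative representation, not the paper's):

```python
from collections import Counter

def fertilities(alignment, J):
    """Fertility phi_j: number of target positions linked to source
    word f_j in the sentence pair.

    alignment: list a[i] giving the source position each target word i
               links to (None for an unaligned target word).
    J        : source sentence length.
    """
    counts = Counter(a for a in alignment if a is not None)
    return [counts.get(j, 0) for j in range(J)]
```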
Experiments | The training corpus includes 1,686,458 sentence pairs . |
Experiments | For example, in the following sentence pair: “[garbled Chinese] (in accordance with the tripartite agreement reached by China, Laos and the UNHCR on ...)”, even though the tagger can successfully label the Chinese name for “UNHCR” as an organization because it is a common Chinese name, English features based on previous GPE contexts still incorrectly predicted “UNHCR” as a GPE name.
Name-aware MT | Given a parallel sentence pair we first apply Giza++ (Och and Ney, 2003) to align words, and apply this join- |
Name-aware MT | For example, given the following sentence pair: |
Name-aware MT | Both sentence pairs are kept in the combined data to build the translation model. |
Corpus Data and Baseline SMT | The SMT parallel training corpus contains approximately 773K sentence pairs (7.3M English words). |
Corpus Data and Baseline SMT | Our phrase-based decoder is similar to Moses (Koehn et al., 2007) and uses the phrase pairs and target LM to perform beam search stack decoding based on a standard log-linear model, the parameters of which were tuned with MERT (Och, 2003) on a held-out development set (3,534 sentence pairs, 45K words) using BLEU as the tuning metric.
Corpus Data and Baseline SMT | Finally, we evaluated translation performance on a separate, unseen test set (3,138 sentence pairs, 38K words).
Experiments | To train our Two-Neighbor Orientation model, we select a subset of 5 million aligned sentence pairs . |
Training | For each aligned sentence pair (F, E, N) in the training data, the training starts with the identification of the regions in the source sentences as anchors (A). |
Two-Neighbor Orientation Model | Given an aligned sentence pair Θ = (F, E, N), let A(Θ) be all possible chunks that can be extracted from Θ according to:
Two-Neighbor Orientation Model | Figure 1: An aligned Chinese-English sentence pair.
Two-Neighbor Orientation Model | To be more concrete, let us consider an aligned sentence pair in Fig. |
Experimental Results | The MT training data includes 2 million sentence pairs from the parallel corpora released by |
Experimental Results | We append a 300-sentence set, for which human hand alignments are available as reference, to the 2M training sentence pairs before running GIZA++.
Integrating Empty Categories in Machine Translation | Table 3 lists some of the most frequent English words aligned to *pro* or *PRO* in a Chinese-English parallel corpus with 2M sentence pairs.
Introduction | A sentence pair observed in the real data is shown in Figure 1 along with the word alignment obtained from an automatic word aligner, where the English subject pronoun |
Experiments | They are neither parallel nor comparable because we cannot even extract a small number of parallel sentence pairs from this monolingual data using the method of (Munteanu and Marcu, 2006). |
Experiments | For the out-of-domain data, we build the phrase table and reordering table using the 2.08 million Chinese-to-English sentence pairs, and we use the SRILM toolkit (Stolcke, 2002) to train the 5-gram English language model with the target part of the parallel sentences and the Xinhua portion of the English Gigaword.
Phrase Pair Refinement and Parameterization | For each entry in LLR-lex, such as ([34], of), we can learn two kinds of information from the out-of-domain word-aligned sentence pairs: one is whether the target translation is before or after the translation of the preceding source-side word (Order); the other is whether the target translation is adjacent to the translation of the preceding source-side word (Adjacency).
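Reading the Order and Adjacency bits off a word-aligned sentence pair can be sketched with a hypothetical helper; the representation (a dict from source position to sorted target positions) and all names are illustrative assumptions, not the paper's code.

```python
def order_adjacency(f_pos, prev_f_pos, align):
    """For a source word and its preceding source word, read two bits of
    information off a word-aligned sentence pair:

    Order    : is the word's target translation before or after the
               translation of the preceding source-side word?
    Adjacency: are the two target translations adjacent?

    align: dict source position -> sorted list of target positions.
    Returns (order, adjacent), or None if either word is unaligned.
    """
    t, pt = align.get(f_pos), align.get(prev_f_pos)
    if not t or not pt:
        return None
    order = "after" if min(t) > min(pt) else "before"
    adjacent = min(t) == max(pt) + 1 or min(pt) == max(t) + 1
    return order, adjacent
```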
Related Work | For the target-side monolingual data, they just use it to train the language model; for the source-side monolingual data, they employ a baseline system (word-based or phrase-based SMT trained with a small-scale bitext) to first translate the source sentences, combine each source sentence and its target translation into a bilingual sentence pair, and then train a new phrase-based SMT system on these pseudo sentence pairs.
A Unified Semantic Representation | Commonly, semantic comparisons are between word pairs or sentence pairs that do not have their lexical content sense-annotated, despite the potential utility of sense annotation in making semantic comparisons. |
Experiment 1: Textual Similarity | As our benchmark, we selected the recent SemEval-2012 task on Semantic Textual Similarity (STS), which was concerned with measuring the semantic similarity of sentence pairs . |
Experiment 1: Textual Similarity | Each sentence pair in the datasets was given a score from 0 to 5 (low to high similarity) by human judges, with a high inter-annotator agreement of around 0.90 when measured using the Pearson correlation coefficient. |
Experiment 1: Textual Similarity | Table 1 lists the number of sentence pairs in training and test portions of each dataset. |
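The Pearson correlation coefficient used above, both for inter-annotator agreement and for scoring STS systems against the gold 0-to-5 judgments, is straightforward to compute; a minimal sketch:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score
    lists, e.g. system similarity scores vs. human judgments."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)
```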
Model | This way we don’t insist on a single tiling of phrases for a sentence pair, but explicitly model the set of hierarchically nested phrases as defined by an ITG derivation.
Model | S → sig(t); sig(t) → yield(t), for every word pair e/f in the sentence pair.
Model | This process is then repeated for each sentence pair in the corpus in a random order. |
Related Work | additional constraints on how phrase-pairs can be tiled to produce a sentence pair, and moreover, we seek to model the embedding of phrase-pairs in one another, something not considered by this prior work.
Introduction | Specifically, we show that we can significantly improve reordering performance by using a large number of sentence pairs for which manual word alignments are not available. |
Reordering model | In this paper we focus on the case where in addition to using a relatively small number of manual word aligned sentences to derive the reference permutations 77* used to train our model, we would like to use more abundant but noisier machine aligned sentence pairs . |
Results and Discussions | We use H to refer to the manually word aligned data and U to refer to the additional sentence pairs for which manual word alignments are not available. |
Theoretical Model | In this manner we obtain sentence pairs like the one shown in Figure 3. |
Theoretical Model | To these sentence pairs we apply the rule extraction method of Maletti (2011). |
Theoretical Model | The rules extracted from the sentence pair of Figure 3 are shown in Figure 4. |
Introduction | For the Chinese-to-English task, the training data is the FBIS corpus (news domain) with about 240k sentence pairs; the development set is the NIST02 evaluation data; the development test set is NIST05; and the test datasets are NIST06 and NIST08.
Introduction | For the Japanese-to-English task, the training data with 300k sentence pairs is from the NTCIR-patent task (Fujii et al., 2010); the development set, development test set, and two test sets are evenly extracted from a given development set with 4000 sentences, and these four datasets are called test1, test2, test3 and test4, respectively.
Introduction | We run GIZA++ (Och and Ney, 2000) on the training corpus in both directions (Koehn et al., 2003) to obtain the word alignment for each sentence pair.
Introduction | Table 1 shows the n-gram overlap proportions in a sentence-aligned data set of 137K sentence pairs from aligning Simple English Wikipedia and English Wikipedia articles (Coster and Kauchak, 2011a). The data highlights two conflicting views: does the benefit of additional data outweigh the problem of the source of the data?
Introduction | On the other hand, there is still only modest overlap between the sentences for longer n-grams, particularly given that the corpus is sentence-aligned and that 27% of the sentence pairs in this aligned data set are identical. |
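N-gram overlap statistics of the kind shown in Table 1 can be reproduced roughly as follows; this is an illustrative sketch (one common definition of overlap over a single aligned pair), not the exact computation behind the table.

```python
def ngram_overlap(simple, normal, n):
    """Proportion of n-grams from the simple side of a sentence-aligned
    pair that also occur on the normal side.

    simple, normal: token lists; n: n-gram order.
    """
    grams = lambda toks: {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    gs, gn = grams(simple), grams(normal)
    return len(gs & gn) / len(gs) if gs else 0.0
```

Averaging this over all aligned pairs gives a corpus-level overlap proportion for each n-gram order.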
Why Does Unsimplified Data Help? | The resulting data set contains 150K aligned simple-normal sentence pairs.
Experiment | So approximately 2.05 million sentence pairs consisting of approximately 54 million |
Experiment | And approximately 0.49 million sentence pairs consisting of 14.9 million Chinese tokens whose lexicon size was 169k and 16.3 million English tokens whose lexicon size was 240k were used for CE. |
Experiment | Our distortion model was trained as follows: We used 0.2 million sentence pairs and their word alignments from the data used to build the translation model as the training data for our distortion models. |
Experiments | Most training subcorpora consist of parallel sentence pairs . |
Introduction | The resulting bilingual sentence pairs are then used as additional training data (Ueffing et al., 2007; Chen et al., 2008; Schwenk, 2008; Bertoldi and Federico, 2009). |
Introduction | Data selection approaches (Zhao et al., 2004; Hildebrand et al., 2005; Lu et al., 2007; Moore and Lewis, 2010; Axelrod et al., 2011) search for bilingual sentence pairs that are similar to the in-domain “dev” data, then add them to the training data. |