Data preparation | We start with a parallel corpus that is tokenised for both L1 and L2. |
Data preparation | The parallel corpus is randomly sampled into two large and equally-sized parts. |
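The random equal split described above can be sketched as follows; `split_parallel_corpus` and the toy sentence pairs are hypothetical illustrations, not the paper's actual code.

```python
import random

def split_parallel_corpus(pairs, seed=0):
    """Shuffle sentence pairs with a fixed seed and split them into two
    equally-sized parts (any odd leftover pair is dropped so the halves
    stay equal). A minimal sketch of the splitting step."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    half = len(pairs) // 2
    return pairs[:half], pairs[half:2 * half]

# Toy parallel corpus of (L1, L2) sentence pairs.
corpus = [("hello", "bonjour"), ("world", "monde"),
          ("cat", "chat"), ("dog", "chien")]
part1, part2 = split_parallel_corpus(corpus)
print(len(part1), len(part2))  # 2 2
```

Fixing the seed makes the split reproducible across experiment runs.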
Data preparation | 1. using phrase-translation table T and parallel corpus split 8 |
Experiments & Results | The data for our experiments were drawn from the Europarl parallel corpus (Koehn, 2005), from which we extracted two sets of 200,000 sentence pairs each for several language pairs. |
Experiments | Dataset and SMT Pipeline We use the NIST MT Chinese-English parallel corpus (NIST), excluding the non-UN and non-HK Hansards portions, as our training dataset. |
Polylingual Tree-based Topic Models | In addition, we extract the word alignments from aligned sentences in a parallel corpus. |
Topic Models for Machine Translation | For a parallel corpus of aligned source and target sentences (F, E), a phrase f ∈ F is translated to a phrase e ∈ E according to a distribution p_w(e | f). One popular method to estimate the probability |
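One standard way to estimate such a phrase-translation distribution is relative frequency over extracted phrase pairs, p(e | f) = count(f, e) / count(f). A minimal sketch, assuming a flat list of aligned phrase pairs (the helper name and toy data are illustrative):

```python
from collections import Counter, defaultdict

def estimate_phrase_probs(aligned_phrase_pairs):
    """Relative-frequency estimate p(e|f) = count(f, e) / count(f)
    from (source_phrase, target_phrase) pairs extracted from word
    alignments."""
    pair_counts = Counter(aligned_phrase_pairs)
    src_counts = Counter(f for f, _ in aligned_phrase_pairs)
    probs = defaultdict(dict)
    for (f, e), c in pair_counts.items():
        probs[f][e] = c / src_counts[f]
    return probs

pairs = [("maison", "house"), ("maison", "house"), ("maison", "home")]
p = estimate_phrase_probs(pairs)
print(p["maison"]["house"])  # 0.666...
```

Real SMT pipelines additionally smooth these estimates and combine them with lexical weights.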
Topic Models for Machine Translation | Our contribution is a set of topics that capture multilingual information and thus better reflect the domains in the parallel corpus. |
Abstract | Most current data selection methods rely solely on language models trained on small-scale in-domain data to select domain-relevant sentence pairs from a general-domain parallel corpus. |
Experiments | The reason is that a large-scale parallel corpus preserves more bilingual knowledge and linguistic phenomena, while a small in-domain corpus suffers from data sparsity, which degrades translation performance. |
Experiments | Results of the systems trained on only a subset of the general-domain parallel corpus. |
Introduction | For this, an effective approach is to automatically select and expand domain-specific sentence pairs from a large-scale general-domain parallel corpus. |
Introduction | We use two complementary paraphrase models: an association model based on aligned phrase pairs extracted from a monolingual parallel corpus, and a vector space model, which represents each utterance as a vector and learns a similarity score between them. |
Introduction | (2013) presented a QA system that maps questions onto simple queries against Open IE extractions, by learning paraphrases from a large monolingual parallel corpus, and performing a single paraphrasing step. |
Model overview | Our framework accommodates any paraphrasing method, and in this paper we propose an association model that learns to associate natural language phrases that co-occur frequently in a monolingual parallel corpus, combined with a vector space model, which learns to score the similarity between vector representations of natural language utterances (Section 5). |
Abstract | Instead of using a parallel corpus, labeled and unlabeled instances in one language are translated into ones in the other language and all instances in both languages are then fed into a bilingual active learning engine as pseudo parallel corpora. |
Abstract | Instead of using a parallel corpus, which would need to contain entity/relation alignment information and is thus difficult to obtain, this paper employs an off-the-shelf machine translator to translate both labeled and unlabeled instances from one language into the other, forming pseudo parallel corpora. |
Abstract | Our lexicon is derived from the FBIS parallel corpus (LDC2003E14), which is widely used in machine translation between English and Chinese. |
Generation & Propagation | Our goal is to obtain translation distributions for source phrases that are not present in the phrase table extracted from the parallel corpus. |
Generation & Propagation | The label space is thus the phrasal translation inventory, and like the source side it can also be represented in terms of a graph, initially consisting of target phrase nodes from the parallel corpus. |
Generation & Propagation | Thus, the target phrase inventory from the parallel corpus may be inadequate for unlabeled instances. |
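The propagation idea above, spreading translation distributions from phrases seen in the parallel corpus to unlabeled source phrases over a similarity graph, can be sketched with a simple iterative averaging scheme. This is a generic label-propagation sketch, not the paper's exact algorithm; the function, node names, and edge weights are illustrative assumptions.

```python
def propagate_translations(edges, labeled, iterations=10):
    """Each unlabeled source-phrase node repeatedly takes the
    similarity-weighted average of its neighbours' translation
    distributions; seed distributions from the phrase table are
    kept fixed. A generic label-propagation sketch."""
    dists = {n: dict(d) for n, d in labeled.items()}
    for _ in range(iterations):
        updates = {}
        for node, nbrs in edges.items():
            if node in labeled:
                continue  # phrase-table seeds stay fixed
            acc, total = {}, 0.0
            for nbr, w in nbrs:
                for t, p in dists.get(nbr, {}).items():
                    acc[t] = acc.get(t, 0.0) + w * p
                total += w
            if total > 0:
                updates[node] = {t: v / total for t, v in acc.items()}
        dists.update(updates)
    return dists

# "x" is an out-of-vocabulary source phrase connected to two
# phrase-table phrases "a" and "b" with equal similarity.
edges = {"x": [("a", 1.0), ("b", 1.0)]}
seeds = {"a": {"t1": 1.0}, "b": {"t1": 0.5, "t2": 0.5}}
dists = propagate_translations(edges, seeds)
print(dists["x"])  # {'t1': 0.75, 't2': 0.25}
```

The unlabeled phrase ends up with a distribution interpolated from its neighbours, which is exactly what lets the inventory cover phrases absent from the extracted phrase table.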