Index of papers in Proc. ACL 2011 that mention

parallel data

Seen in text as:

parallel data (43)
Parallel Data (3)

Seen in 46 sentences in 4 papers.

1. Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

A Joint Model with Unlabeled Parallel Text	sentiment) bilingual (in L1 and L2) parallel data U that are defined as follows.
A Joint Model with Unlabeled Parallel Text	where v ∈ {1,2} denotes L1 or L2; the first term on the right-hand side is the likelihood of labeled data for both D1 and D2; and the second term is the likelihood of the unlabeled parallel data U.
A Joint Model with Unlabeled Parallel Text	However, there could be considerable noise in real-world parallel data, i.e.
Abstract	We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data.
Experimental Setup 4.1 Data Sets and Preprocessing	We also try to remove neutral sentences from the parallel data since they can introduce noise into our model, which deals only with positive and negative examples.
Experimental Setup 4.1 Data Sets and Preprocessing	Co-Training with SVMs (Co-SVM): This method applies SVM-based co-training given both the labeled training data and the unlabeled parallel data following Wan (2009).
Introduction	We furthermore find that improvements, albeit smaller, are obtained when the parallel data is replaced with a pseudo-parallel (i.e.
Results and Analysis	By making use of the unlabeled parallel data, our proposed approach improves the accuracy, compared to MaxEnt, by 8.12% (or 33.27% error reduction) on English and 3.44% (or 16.92% error reduction) on Chinese in the first setting, and by 5.07% (or 19.67% error reduction) on English and 3.87% (or 19.4% error reduction) on Chinese in the second setting.

parallel data is mentioned in 25 sentences in this paper.


2. Deciphering Foreign Language

Ravi, Sujith and Knight, Kevin

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Of course, for many language pairs and domains, parallel data is not available.
Machine Translation as a Decipherment Task	We now turn to the problem of MT without parallel data.
Machine Translation as a Decipherment Task	Next, we present two novel decipherment approaches for MT training without parallel data.
Machine Translation as a Decipherment Task	Bayesian Decipherment: We introduce a novel method for estimating IBM Model 3 parameters without parallel data, using Bayesian learning.
Word Substitution Decipherment	Before we tackle machine translation without parallel data, we first solve a simpler problem: word substitution decipherment.

parallel data is mentioned in 12 sentences in this paper.


3. Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections

Das, Dipanjan and Petrov, Slav

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	thank Amarnag Subramanya for helping us with the implementation of label propagation and Shankar Kumar for access to the parallel data.
Experiments and Results	The parallel data came from the Europarl corpus (Koehn, 2005) and the ODS United Nations dataset (UN, 2006).
Experiments and Results	Taking the intersection of languages in these resources, and selecting languages with large amounts of parallel data, yields the following set of eight Indo-European languages: Danish, Dutch, German, Greek, Italian, Portuguese, Spanish and Swedish.
Experiments and Results	Projection: Our third baseline incorporates bilingual information by projecting POS tags directly across alignments in the parallel data.
Introduction	To bridge this gap, we consider a practically motivated scenario, in which we want to leverage existing resources from a resource-rich language (like English) when building tools for resource-poor foreign languages. We assume that absolutely no labeled training data is available for the foreign language of interest, but that we have access to parallel data with a resource-rich language.

parallel data is mentioned in 5 sentences in this paper.


4. Collecting Highly Parallel Data for Paraphrase Evaluation

Chen, David and Dolan, William

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	We introduced a data collection framework that produces highly parallel data by asking different annotators to describe the same video segments.
Discussions and Future Work	While our data collection framework yields useful parallel data, it also has some limitations.
Discussions and Future Work	By pairing up descriptions of the same video in different languages, we obtain parallel data without requiring any bilingual skills.
Experiments	We quantified the utility of our highly parallel data by computing the correlation between BLEU and human ratings when different numbers of references were available.

parallel data is mentioned in 4 sentences in this paper.
