Abstract | In this paper, we propose a generative cross-lingual mixture model (CLMM) to leverage unlabeled bilingual parallel data. |
Abstract | By fitting parameters to maximize the likelihood of the bilingual parallel data, the proposed model learns previously unseen sentiment words from the large bilingual parallel data and improves vocabulary coverage significantly. |
Experiment | CLMM includes two hyper-parameters (λs and λt) controlling the contribution of unlabeled parallel data. |
Experiment | 4.5 The Influence of Unlabeled Parallel Data |
Experiment | In this subsection, we investigate how the size of the unlabeled parallel data affects sentiment classification. |
Introduction | Instead of relying on unreliable machine-translated labeled data, CLMM leverages bilingual parallel data to bridge the language gap between the source language and the target language. |
Introduction | CLMM is a generative model that treats the source language and target language words in parallel data as generated simultaneously by a set of mixture components. |
Introduction | This paper makes two contributions: (1) we propose a model to effectively leverage large bilingual parallel data for improving vocabulary coverage; and (2) the proposed model is applicable in both settings of cross-lingual sentiment classification, irrespective of the availability of labeled data in the target language. |
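Introduction | To make this generative story concrete, a minimal sketch of such a mixture likelihood for a parallel sentence pair $(w^s, w^t)$ is given below; the conditional independence of words given the component and the exact parameterization are assumptions of this sketch, not necessarily the paper's formulation. |

$$
P(w^s, w^t) \;=\; \sum_{c} P(c)\, \prod_{i} P\big(w^s_i \mid c\big)\, \prod_{j} P\big(w^t_j \mid c\big),
$$

Introduction | where $c$ ranges over the mixture components (e.g., sentiment polarities) and the word-emission distributions for both languages are fit by maximum likelihood on the unlabeled parallel data, e.g., with EM. |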
Abstract | We argue that multilingual parallel data provides a valuable source of indirect supervision for induction of shallow semantic representations. |
Abstract | When applied to German-English parallel data, our method obtains a substantial improvement over a model trained without using the agreement signal, when both are tested on nonparallel sentences. |
Conclusions | We show that an agreement signal extracted from parallel data provides indirect supervision capable of substantially improving a state-of-the-art model for semantic role induction. |
Introduction | The goal of this work is to show that parallel data is useful in unsupervised induction of shallow semantic representations. |
Multilingual Extension | As we argued in Section 1, our goal is to penalize disagreement between the semantic structures predicted for each language on parallel data. |
Multilingual Extension | Intuitively, when two arguments are aligned in parallel data, we expect them to be labeled with the same semantic role in both languages. |
Multilingual Extension | Specifically, we augment the joint probability with a penalty term computed on parallel data: |
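Multilingual Extension | The equation itself did not survive extraction; a generic form consistent with the surrounding description is sketched below, where $\Delta$ and $\gamma$ are notation introduced for this sketch rather than taken from the paper. |

$$
P\big(y^{(1)}, y^{(2)}\big) \;\propto\; P\big(y^{(1)}\big)\, P\big(y^{(2)}\big)\, \exp\!\big\{-\gamma\, \Delta\big(y^{(1)}, y^{(2)}\big)\big\},
$$

Multilingual Extension | where $y^{(1)}$ and $y^{(2)}$ are the role labelings of the two aligned sentences, $\Delta$ counts aligned argument pairs assigned different roles, and $\gamma \geq 0$ controls the strength of the agreement signal. |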
Abstract | Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific purpose. |
Inferring a learning curve from mostly monolingual data | However, when a configuration of four initial points is used for the same amount of "seed" parallel data, it outperforms both configurations with three initial points. |
Inferring a learning curve from mostly monolingual data | The ability to predict the amount of parallel data required to achieve a given level of quality is very valuable in planning business deployments of statistical machine translation; yet, we are not aware of any rigorous proposal for addressing this need. |
Introduction | Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific business purpose. |
Introduction | This prediction, or more generally the prediction of the learning curve of an SMT system as a function of available in-domain parallel data, is the objective of this paper. |
Introduction | The results show that without any parallel data the expected translation accuracy at 75K segments can be predicted within an error of 6 BLEU points (Table 4), while using a seed training corpus of 10K segments narrows this error to within 1.5 points (Table 6). |
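Introduction | As an illustration of this kind of learning-curve extrapolation, the sketch below fits a simple power-law curve to a handful of seed measurements and extrapolates to 75K segments; the functional form, the seed points, and all numbers are assumptions for illustration, not the paper's model or results. |

```python
# Sketch: extrapolating an SMT learning curve from a few seed measurements.
# The power-law form BLEU(n) = a - b * n**(-c) and all numbers below are
# illustrative assumptions, not the paper's model or data.
import numpy as np
from scipy.optimize import curve_fit

def bleu_curve(n, a, b, c):
    """Power-law learning curve: approaches the ceiling a as data size n grows."""
    return a - b * n ** (-c)

# Hypothetical seed points: (training size in segments, measured BLEU).
sizes = np.array([1_000, 2_500, 5_000, 10_000], dtype=float)
bleu = np.array([18.2, 21.5, 24.1, 26.3])

params, _ = curve_fit(bleu_curve, sizes, bleu, p0=(40.0, 100.0, 0.3), maxfev=10_000)

# Extrapolate the fitted curve to a 75K-segment corpus.
print(f"predicted BLEU at 75K segments: {bleu_curve(75_000, *params):.1f}")
```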
Experimental Evaluation | We also compare the results on these corpora to a system trained on parallel data. |
Experimental Evaluation | Och (2002) reports results of 48.2 BLEU for a single-word based translation system and 56.1 BLEU using the alignment template approach, both trained on parallel data. |
Related Work | Unsupervised training of statistical translation systems without parallel data and related problems have been addressed before. |
Related Work | Close to the methods described in this work, Ravi and Knight (2011) treat training and translation without parallel data as a deciphering problem. |
Related Work | They perform experiments on a Spanish-English task with vocabulary sizes of about 500 words and achieve a performance of around 20 BLEU compared to 70 BLEU obtained by a system that was trained on parallel data. |
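Related Work | The decipherment view can be summarized by the following objective, a standard formulation in this line of work (the notation here is generic): the channel parameters $\theta$ are chosen to maximize the likelihood of the observed foreign text $f$, marginalizing over latent English "plaintexts" $e$ scored by a language model. |

$$
\hat{\theta} \;=\; \arg\max_{\theta} \prod_{f} \, \sum_{e} P_{\mathrm{LM}}(e)\, P_{\theta}(f \mid e)
$$

Related Work | Training then typically proceeds with EM or an approximation of it over this marginal likelihood, and translation amounts to decoding the most probable $e$ under the learned channel and the language model. |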