A Joint Model with Unlabeled Parallel Text | the sentence pairs may be noisily parallel (or even comparable) instead of fully parallel (Munteanu and Marcu, 2005).
A Joint Model with Unlabeled Parallel Text | In such noisy cases, the labels (positive or negative) could be different for the two monolingual sentences in a sentence pair.
A Joint Model with Unlabeled Parallel Text | Although we do not know the exact probability that a sentence pair exhibits the same label, we can approximate it using their translation probability.
Experimental Setup 4.1 Data Sets and Preprocessing | Because sentence pairs in the ISI corpus are quite noisy, we rely on Giza++ (Och and Ney, 2003) to obtain a new translation probability for each sentence pair, and select the 100,000 pairs with the highest translation probabilities.5
Experimental Setup 4.1 Data Sets and Preprocessing | We then classify each unlabeled sentence pair by concatenating its two sentences into one.
Experimental Setup 4.1 Data Sets and Preprocessing | 5We removed sentence pairs with an original confidence score (given in the corpus) below 0.98, and also removed pairs that are too long (more than 60 characters in either sentence) to facilitate Giza++.
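Experimental Setup 4.1 Data Sets and Preprocessing | The filtering pipeline described above can be sketched as follows. This is a hypothetical illustration: the tuple layout and function name are assumptions; only the thresholds (confidence ≥ 0.98, length ≤ 60 characters, top 100,000 by Giza++ translation probability) come from the text.

```python
# Hypothetical sketch of the corpus-filtering steps described above.
# The (src, tgt, corpus_conf, giza_prob) tuple layout is assumed.

def filter_corpus(pairs, top_k=100_000, min_conf=0.98, max_len=60):
    """pairs: iterable of (src, tgt, corpus_conf, giza_prob) tuples.

    Drop pairs whose original corpus confidence is below min_conf or
    whose sentences exceed max_len characters, then keep the top_k
    pairs with the highest Giza++ translation probability.
    """
    kept = [
        (src, tgt, giza_prob)
        for src, tgt, corpus_conf, giza_prob in pairs
        if corpus_conf >= min_conf
        and len(src) <= max_len and len(tgt) <= max_len
    ]
    kept.sort(key=lambda t: t[2], reverse=True)
    return kept[:top_k]
```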
Results and Analysis | Preliminary experiments showed that Equation 5 does not significantly improve the performance in our case, which is reasonable since we choose only sentence pairs with the highest translation probabilities to be our unlabeled data (see Section 4.1). |
Results and Analysis | However, even with only 2,000 unlabeled sentence pairs, the proposed approach still produces large performance gains.
Results and Analysis | Examination of those sentence pairs in setting 2 for which the two monolingual models still |
Introduction | Word alignment is the task of identifying corresponding words in sentence pairs.
Model Definition | Our bidirectional model G = (V, D) is a globally normalized, undirected graphical model of the word alignment for a fixed sentence pair (e, f). Each vertex in the vertex set V corresponds to a model variable V_i, and each undirected edge in the edge set D corresponds to a pair of variables (V_i, V_j). Each vertex has an associated potential function ω_i(v_i) that assigns a real-valued potential to each possible value v_i of V_i. Likewise, each edge has an associated potential function μ_ij(v_i, v_j) that scores pairs of values.
Model Definition | The highest-probability word alignment vector under the model for a given sentence pair (e, f) can be computed exactly using the standard Viterbi algorithm for HMMs in O(|e|² · |f|) time.
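Model Definition | The Viterbi decoding mentioned here can be sketched as below. This is a minimal illustration of HMM alignment decoding, not the paper's exact parameterization: the log-probability tables `emit` and `trans` and the uniform initial distribution are assumptions.

```python
import math

def viterbi_alignment(emit, trans, n_e, n_f):
    """Viterbi decoding for an HMM word aligner (sketch).

    emit[i][j]: log p(f_j | e_i); trans[ip][i]: log p(i | ip).
    Returns the best alignment a_1..a_{n_f}, one e-index per f word.
    Runtime is O(|e|^2 * |f|): for each of the n_f target positions
    we score all n_e * n_e state transitions.
    """
    delta = [[-math.inf] * n_e for _ in range(n_f)]
    back = [[0] * n_e for _ in range(n_f)]
    for i in range(n_e):                      # assumed uniform start
        delta[0][i] = emit[i][0] - math.log(n_e)
    for j in range(1, n_f):
        for i in range(n_e):
            best, arg = -math.inf, 0
            for ip in range(n_e):
                score = delta[j - 1][ip] + trans[ip][i]
                if score > best:
                    best, arg = score, ip
            delta[j][i] = best + emit[i][j]
            back[j][i] = arg
    # Backtrace from the best final state.
    a = [max(range(n_e), key=lambda i: delta[-1][i])]
    for j in range(n_f - 1, 0, -1):
        a.append(back[j][a[-1]])
    return list(reversed(a))
```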
Model Definition | Figure 1: The structure of our graphical model for a simple sentence pair.
Model Inference | Moreover, the value of u is specific to a sentence pair.
Model Inference | Memory requirements are virtually identical to the baseline: only u must be stored for each sentence pair as it is being processed, but it can then be immediately discarded once alignments are inferred.
Experiments | The parallel training data comprises 9.6M sentence pairs (206M Chinese and 228M English words).
Hard rule labeling from word classes | (2003) to provide us with a set of phrase pairs for each sentence pair in the training corpus, annotated with their respective start and end positions in the source and target sentences. |
Hard rule labeling from word classes | Consider the target-tagged example sentence pair: |
Hard rule labeling from word classes | Intuitively, labeling initial rules with tags marking the boundaries of their target sides yields complex rules whose nonterminal occurrences impose weak syntactic constraints on the rules eligible for substitution in a PSCFG derivation: the left and right boundary word tags of the inserted rule’s target side must match the respective boundary word tags of the phrase pair that was replaced by a nonterminal when the complex rule was created from a training sentence pair.
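Hard rule labeling from word classes | The boundary-tag labeling idea can be sketched as follows. This is a hypothetical illustration: the "LEFT-RIGHT" label format and function names are assumptions; only the principle (a nonterminal is labeled by the tags of the first and last target-side words, and substitution requires an exact label match) comes from the text.

```python
# Hypothetical sketch of hard rule labeling from boundary word tags.

def boundary_label(target_tags):
    """Label a phrase pair's nonterminal by the tags of the first and
    last words of its target side, e.g. ["DT", "JJ", "NN"] -> "DT-NN".
    """
    return f"{target_tags[0]}-{target_tags[-1]}"

def compatible(label_a, label_b):
    """A rule may substitute into a nonterminal only when the boundary
    tags match exactly (the weak syntactic constraint described above).
    """
    return label_a == label_b
```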
Experiments | For all language pairs we employ 200K and 400K sentence pairs for training, 2K for development and 2K for testing (single reference per source sentence). |
Experiments | Table 1 presents the results for the baseline and our method for the 4 language pairs, for training sets of both 200K and 400K sentence pairs.
Experiments | In addition, increasing the size of the training data from 200K to 400K sentence pairs widens the performance margin between the baseline and our system, in some cases considerably. |
Introduction | Starting with the classic IBM work (Brown et al., 1993), training has been viewed as a maximization problem involving hidden word alignments (a) that are assumed to underlie observed sentence pairs |
Machine Translation as a Decipherment Task | Brown et al. (1993) provide an efficient algorithm for training the IBM Model 3 translation model when parallel sentence pairs are available.
Machine Translation as a Decipherment Task | We see that deciphering with 10k monolingual Spanish sentences yields the same performance as training with around 200-500 parallel English/Spanish sentence pairs.