Index of papers in Proc. ACL 2014 that mention
  • phrase pair
Hashimoto, Chikara and Torisawa, Kentaro and Kloetzer, Julien and Sano, Motoki and Varga, István and Oh, Jong-Hoon and Kidawara, Yutaka
Event Causality Extraction Method
An event causality candidate is given a causality score, which is the SVM score (the distance from the hyperplane) normalized to [0,1] by the sigmoid function. Each event causality candidate may be given multiple original sentences, since a phrase pair can appear in multiple sentences, in which case it is given more than one SVM score.
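For illustration, a minimal sketch of the sigmoid normalization described above. How multiple SVM scores per candidate are combined is not stated in this excerpt, so taking the maximum below is an assumption, as are the example values.

```python
import math

def normalize_svm_score(svm_score: float) -> float:
    """Map an SVM decision value (signed distance from the hyperplane)
    into [0, 1] with the standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-svm_score))

# A candidate whose phrase pair occurs in several sentences gets one SVM
# score per sentence; taking the maximum here is an assumption for illustration.
def causality_score(svm_scores: list[float]) -> float:
    return max(normalize_svm_score(s) for s in svm_scores)

print(causality_score([-0.3, 1.2, 0.4]))  # ~0.769
```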
Experiments
These three datasets have no overlap in terms of phrase pairs.
Experiments
We observed that CEAsup and CEAWS performed poorly and tended to favor event causality candidates whose phrase pairs were highly relevant to each other but described contrasts between events rather than event causality (e.g., build a slow muscle and build a fast muscle), probably because their…
Experiments
phrase pairs described two events that often happen in parallel but are not event causality (e.g., reduce the intake of energy and increase the energy consumption) in the highly ranked event causality candidates of Csuns and Cssup.
Future Scenario Generation Method
A naive approach chains two phrase pairs by exact matching.
Future Scenario Generation Method
Scenarios (scs) generated by chaining causally-compatible phrase pairs are scored by Score(sc), which embodies our assumption that an acceptable scenario consists of plausible event causality pairs:
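The scoring formula itself is truncated in this excerpt. For illustration only, a small sketch of chaining two phrase pairs by exact matching (the "naive approach" mentioned above) and of scoring the resulting scenario; treating Score(sc) as the product of the per-pair causality scores is an assumption made for this sketch, not the paper's actual definition, and the example phrases are invented.

```python
def chain(pair1, pair2):
    """Naive chaining by exact matching: (A, B) and (B, C) -> (A, B, C)
    only if the effect phrase of the first pair is string-identical to
    the cause phrase of the second pair."""
    cause1, effect1 = pair1
    cause2, effect2 = pair2
    if effect1 == cause2:
        return (cause1, effect1, effect2)
    return None

def scenario_score(pair_scores):
    """Assumed scoring: product of the per-pair causality scores (illustration only)."""
    score = 1.0
    for s in pair_scores:
        score *= s
    return score

# Hypothetical causality phrase pairs used only to exercise the sketch.
sc = chain(("slash labor costs", "deteriorate the quality of care"),
           ("deteriorate the quality of care", "increase accidents"))
print(sc, scenario_score([0.8, 0.6]))
```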
Introduction
Annotators regarded as event causality only phrase pairs that were interpretable as event causality without contexts (i.e., self-contained).
phrase pair is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Abstract
A semi-supervised training approach is proposed to train the parameters, and the phrase pair embedding is explored to model translation confidence directly.
Introduction
(2013) use recursive auto-encoders to make full use of the entire merging phrase pairs, going beyond the boundary words with a maximum entropy classifier (Xiong et al., 2006).
Introduction
To model the translation confidence for a translation phrase pair, we initialize the phrase pair embedding by leveraging the sparse features and a recurrent neural network.
Introduction
The sparse features are phrase pairs in the translation table, and a recurrent neural network is utilized to learn a smoothed translation score with the source and target side information.
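One plausible reading of the two sentences above is that each phrase pair in the translation table acts as a sparse indicator feature which is mapped to a dense vector. A minimal sketch under that assumption follows; the table entries, dimensionality, and initialization are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical translation table: each distinct phrase pair gets an index,
# so the sparse feature for a pair is a one-hot vector over this table.
phrase_table = {("maison", "house"): 0, ("maison", "home"): 1, ("chat", "cat"): 2}
embedding_dim = 8

# Embedding matrix: one row per phrase pair; initialized randomly here and,
# in the setting the excerpt describes, tuned during training.
pair_embeddings = rng.normal(scale=0.1, size=(len(phrase_table), embedding_dim))

def phrase_pair_embedding(src: str, tgt: str) -> np.ndarray:
    """Look up the dense embedding of a phrase pair via its sparse index."""
    return pair_embeddings[phrase_table[(src, tgt)]]

print(phrase_pair_embedding("maison", "house").shape)  # (8,)
```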
Our Model
We then check whether translation candidates can be found in the translation table for each span, together with the phrase pair embedding and recurrent input vector (global features).
Our Model
We extract phrase pairs using the conventional method (Och and Ney, 2004).
Our Model
Representations of phrase pairs are automatically learnt to optimize the translation performance, while the features used in the conventional model are handcrafted.
Related Work
Given the representations of the smaller phrase pairs, recursive auto-encoder can generate the representation of the parent phrase pair with a reordering confidence score.
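A generic recursive auto-encoder style sketch of that composition step: two child phrase-pair representations are merged into a parent representation and the reordering decision is scored. The weight shapes, tanh/sigmoid choices, and values below are illustrative assumptions, not the cited model's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                                        # dimensionality of phrase-pair representations
W = rng.normal(scale=0.1, size=(d, 2 * d))   # composition weights (hypothetical)
b = np.zeros(d)
w_r = rng.normal(scale=0.1, size=2 * d)      # reordering scorer weights (hypothetical)

def compose(child1: np.ndarray, child2: np.ndarray):
    """Merge two child representations into a parent representation and
    produce a reordering confidence score for the merge."""
    children = np.concatenate([child1, child2])
    parent = np.tanh(W @ children + b)
    reorder_confidence = 1.0 / (1.0 + np.exp(-(w_r @ children)))  # sigmoid score
    return parent, reorder_confidence

p, conf = compose(rng.normal(size=d), rng.normal(size=d))
print(p.shape, float(conf))
```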
phrase pair is mentioned in 36 sentences in this paper.
Topics mentioned in this paper:
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Input Features for DNN Feature Learning
Following (Maskey and Zhou, 2012), we use the following four phrase features of each phrase pair (Koehn et al., 2003) in the phrase table as the first type of input features: the bidirectional phrase translation probabilities (P(e|f) and P(f|e)) and the bidirectional lexical weightings (Lex(e|f) and Lex(f|e)), …
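A small sketch of assembling those four scores into an input vector for feature learning; the numeric values are hypothetical, and whether the scores are log-transformed is an assumption.

```python
import math

# Hypothetical phrase table entry for one phrase pair, with the four
# Koehn et al. (2003) style scores named in the excerpt above.
entry = {
    "p_e_given_f": 0.42,    # phrase translation probability P(e|f)
    "p_f_given_e": 0.35,    # phrase translation probability P(f|e)
    "lex_e_given_f": 0.21,  # lexical weighting Lex(e|f)
    "lex_f_given_e": 0.18,  # lexical weighting Lex(f|e)
}

# First type of input features: the four scores, log-transformed here.
x = [math.log(entry[k]) for k in
     ("p_e_given_f", "p_f_given_e", "lex_e_given_f", "lex_f_given_e")]
print(x)
```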
Input Features for DNN Feature Learning
3.2 Phrase pair similarity
Input Features for DNN Feature Learning
Zhao et al. (2004) proposed a way of using term weight based models in a vector space as additional evidence for phrase pair translation quality.
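One common way to realize this idea is to represent the source phrase as a term-weight vector, project the target phrase into the source vocabulary through a lexical translation table, and take the cosine of the two vectors. The sketch below follows that reading; the projection scheme and the toy table are assumptions, not the exact model of Zhao et al. (2004).

```python
import math
from collections import Counter

# Hypothetical lexical translation table: target word -> {source word: prob}.
lex = {"cat": {"chat": 0.9}, "the": {"le": 0.5, "la": 0.4}}

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def phrase_pair_similarity(src_phrase, tgt_phrase):
    """Vector-space similarity between the source phrase and the target
    phrase projected into the source vocabulary (illustrative only)."""
    src_vec = Counter(src_phrase)
    proj = Counter()
    for tw in tgt_phrase:
        for sw, p in lex.get(tw, {}).items():
            proj[sw] += p
    return cosine(src_vec, proj)

print(phrase_pair_similarity(["le", "chat"], ["the", "cat"]))
```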
Introduction
First, the original input features for the DBN feature learning are too simple: only the four phrase features of each phrase pair, such as the bidirectional phrase translation probabilities and bidirectional lexical weightings (Koehn et al., 2003), which is a bottleneck for learning an effective feature representation.
Introduction
To address the first shortcoming, we adapt and extend some simple but effective phrase features as the input features for the new DNN feature learning. These features, such as phrase pair similarity (Zhao et al., 2004), phrase frequency, phrase length (Hopkins and May, 2011), and phrase generative probability (Foster et al., 2010), have been shown to yield significant improvements for SMT, and they also yield further improvement for the new phrase feature learning in our experiments.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
To speed up the pre-training, we subdivide all the phrase pairs (with features X) in the phrase table into small mini-batches, each containing 100 cases, and update the weights after each mini-batch.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Each layer is greedily pre-trained for 50 epochs over all the phrase pairs.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
After the pre-training, for each phrase pair in the phrase table, we generate the DBN features (Maskey and Zhou, 2012) by passing the original phrase features X through the DBN using forward computation.
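A structural sketch of the loop these three excerpts describe: greedy layer-wise pre-training over mini-batches of 100 phrase pairs for 50 epochs per layer, followed by a forward pass per phrase pair to generate its learned features. The actual weight update (contrastive divergence or an auto-encoder step) is deliberately left as a placeholder, and the data sizes and layer widths are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_layer(data, hidden_dim, epochs=50, batch_size=100):
    """Placeholder for greedy layer-wise pre-training of one layer.
    The real procedure would update weights on each mini-batch; here we
    only iterate the mini-batch schedule and return random weights."""
    n, d = data.shape
    W = rng.normal(scale=0.1, size=(d, hidden_dim))
    for _ in range(epochs):
        for start in range(0, n, batch_size):
            batch = data[start:start + batch_size]
            # ... weight update on `batch` would go here ...
            pass
    return W

# Hypothetical phrase-table feature matrix X: one row of original phrase
# features per phrase pair (here 1,000 pairs with 16 features each).
X = rng.normal(size=(1000, 16))

layer_sizes = [32, 32]
weights, h = [], X
for size in layer_sizes:              # greedy: train one layer at a time
    W = pretrain_layer(h, size)
    h = sigmoid(h @ W)                # output of this layer feeds the next
    weights.append(W)

def dbn_features(x):
    """Forward computation: pass the original phrase features of one
    phrase pair through the stacked layers to get its learned features."""
    for W in weights:
        x = sigmoid(x @ W)
    return x

print(dbn_features(X[0]).shape)       # (32,)
```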
phrase pair is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Zhang, Jiajun and Liu, Shujie and Li, Mu and Zhou, Ming and Zong, Chengqing
Bilingually-constrained Recursive Auto-encoders
We can infer from this fact that if a model can learn the same embedding for any phrase pair sharing the same meaning, the learned embedding must encode the semantics of the phrases, and the corresponding model is the one we desire.
Bilingually-constrained Recursive Auto-encoders
For a phrase pair (s, t), two kinds of errors are involved:
Bilingually-constrained Recursive Auto-encoders
For the phrase pair (s, t), the joint error is:
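The equation itself is truncated in this excerpt. As a hedged illustration only: bilingually constrained recursive auto-encoder objectives of this kind typically combine the reconstruction errors of the two phrases with a semantic error between their embeddings. The weighting scheme and the squared-distance semantic term below are assumptions, not the paper's exact formula.

```python
import numpy as np

def joint_error(rec_error_s, rec_error_t, emb_s, emb_t, alpha=0.15):
    """Hedged sketch of a joint error for a phrase pair (s, t): a weighted
    sum of the two monolingual reconstruction errors and a semantic error
    measuring how far the source and target phrase embeddings lie apart.
    alpha and the distance used here are illustrative assumptions."""
    semantic_error = 0.5 * float(np.sum((np.asarray(emb_s) - np.asarray(emb_t)) ** 2))
    return alpha * (rec_error_s + rec_error_t) + (1.0 - alpha) * semantic_error

print(joint_error(0.8, 0.6, [0.1, 0.4], [0.0, 0.5]))
```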
Experiments
To obtain high-quality bilingual phrase pairs to train our BRAE model, we perform forced decoding for the bilingual training sentences and collect the phrase pairs used.
Experiments
After removing the duplicates, the remaining 1.12M bilingual phrase pairs (length ranging from 1 to 7) are obtained.
Experiments
These algorithms are based on corpus statistics including co-occurrence statistics, phrase pair usage and composition information.
Introduction
In decoding with phrasal semantic similarities, we apply the semantic similarities of the phrase pairs as new features during decoding to guide translation candidate…
phrase pair is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Generation & Propagation
In order to utilize these newly acquired phrase pairs, we need to compute their relevant features.
Generation & Propagation
The phrase pairs have four log-probability features: two likelihood features and two lexical weighting features.
Generation & Propagation
In addition, we use a sophisticated lexicalized hierarchical reordering model (HRM) (Galley and Manning, 2008) with five features for each phrase pair.
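A small sketch of the feature layout these two excerpts describe for each newly acquired phrase pair: four log-probability features (two likelihood, two lexical weighting) plus five HRM features. The names and values are hypothetical placeholders.

```python
import math

# Hypothetical feature bundle for one newly acquired phrase pair.
new_pair_features = {
    # four log-probability features
    "log_p_e_given_f":   math.log(0.30),  # likelihood, forward
    "log_p_f_given_e":   math.log(0.25),  # likelihood, backward
    "log_lex_e_given_f": math.log(0.12),  # lexical weighting, forward
    "log_lex_f_given_e": math.log(0.10),  # lexical weighting, backward
    # five lexicalized hierarchical reordering model (HRM) features
    # (placeholder names, not the model's actual feature names)
    "hrm_1": 0.0, "hrm_2": 0.0, "hrm_3": 0.0, "hrm_4": 0.0, "hrm_5": 0.0,
}
print(len(new_pair_features))  # 9 features per phrase pair
```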
Related Work
The operational scope of their approach is limited in that they assume a scenario where unknown phrase pairs are provided (thereby sidestepping the issue of translation candidate generation for completely unknown phrases), and what remains is the estimation of phrasal probabilities.
Related Work
In our case, we obtain the phrase pairs from the graph structure (and therefore indirectly from the monolingual data) and a separate generation step, which plays an important role in good performance of the method.
phrase pair is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Berant, Jonathan and Liang, Percy
Introduction
We use two complementary paraphrase models: an association model based on aligned phrase pairs extracted from a monolingual parallel corpus, and a vector space model, which represents each utterance as a vector and learns a similarity score between them.
Paraphrasing
We define associations in x and c primarily by looking up phrase pairs in a phrase table constructed using the PARALEX corpus (Fader et al., 2013).
Paraphrasing
We use the word alignments to construct a phrase table by applying the consistent phrase pair heuristic (Och and Ney, 2004) to all 5-grams.
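A compact sketch of the consistency check behind that heuristic: a (source span, target span) pair is kept only if it contains at least one alignment link and no link connects a word inside either span to a word outside the other span, with spans limited to 5-grams as the excerpt says. This is a generic rendering of the Och and Ney (2004) style heuristic, not the authors' code.

```python
def consistent_phrase_pairs(src_len, tgt_len, alignment, max_len=5):
    """Extract phrase pairs consistent with word alignments: a source span
    and target span form a pair if they contain at least one alignment link
    and no link crosses the span boundary on either side."""
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(src_len, i1 + max_len)):
            for j1 in range(tgt_len):
                for j2 in range(j1, min(tgt_len, j1 + max_len)):
                    inside = [(i, j) for (i, j) in alignment
                              if i1 <= i <= i2 and j1 <= j <= j2]
                    violated = any((i1 <= i <= i2) != (j1 <= j <= j2)
                                   for (i, j) in alignment)
                    if inside and not violated:
                        pairs.append(((i1, i2), (j1, j2)))
    return pairs

# Toy example: 3-word source, 3-word target, alignment links (src_idx, tgt_idx).
print(consistent_phrase_pairs(3, 3, [(0, 0), (1, 2), (2, 1)]))
```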
Paraphrasing
This results in a phrase table with approximately 1.3 million phrase pairs.
phrase pair is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Topic Models for Machine Translation
Lexical Weighting In phrase-based SMT, lexical weighting features estimate the phrase pair quality by combining lexical translation probabilities of words in a phrase (Koehn et al., 2003).
Topic Models for Machine Translation
The phrase pair probabilities p_w(e|f) are the normalized product of the lexical probabilities of the aligned word pairs within that phrase pair (Koehn et al., 2003).
Topic Models for Machine Translation
from which we can compute the phrase pair probabilities p_w(e|f; k) by multiplying the lexical probabilities and normalizing as in Koehn et al.
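A sketch of the Koehn et al. (2003) lexical weighting these excerpts refer to: for each target word, average the lexical translation probabilities of the source words it is aligned to (NULL for unaligned words), then multiply over the target phrase. The topic-conditioned variant p_w(e|f; k) would simply plug in topic-specific word probabilities. The toy table and alignment below are hypothetical.

```python
from math import prod

def lexical_weighting(e_words, f_words, alignment, w):
    """p_w(e|f, a): for each target word e_i, average w(e_i|f_j) over the
    source words it is aligned to (or use w(e_i|NULL) if unaligned), then
    take the product over the target phrase."""
    factors = []
    for i, e in enumerate(e_words):
        linked = [j for (i2, j) in alignment if i2 == i]
        if linked:
            factors.append(sum(w[(e, f_words[j])] for j in linked) / len(linked))
        else:
            factors.append(w[(e, "NULL")])
    return prod(factors)

# Hypothetical lexical translation table w(e|f) and a toy aligned phrase pair.
w = {("house", "maison"): 0.8, ("the", "la"): 0.4, ("the", "NULL"): 0.05}
print(lexical_weighting(["the", "house"], ["la", "maison"],
                        alignment=[(0, 0), (1, 1)], w=w))
# -> 0.4 * 0.8 = 0.32
```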
phrase pair is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Chang, Yin-Wen and Rush, Alexander M. and DeNero, John and Collins, Michael
Experiments
Table 2: Alignment accuracy and phrase pair extraction accuracy for directional and bidirectional models.
Experiments
AER is the alignment error rate and F1 is the phrase pair extraction F1 score.
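The phrase pair extraction F1 mentioned above can be computed by comparing the set of phrase pairs extracted under the model's alignments with the set extracted under gold alignments; a small sketch follows, with the evaluation-protocol details treated as assumptions and the span sets invented.

```python
def phrase_pair_f1(predicted_pairs, gold_pairs):
    """F1 between phrase pairs extracted from predicted alignments and
    phrase pairs extracted from gold alignments (sets of span pairs)."""
    predicted, gold = set(predicted_pairs), set(gold_pairs)
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

pred = {((0, 1), (0, 1)), ((2, 2), (2, 2))}
gold = {((0, 1), (0, 1)), ((2, 3), (2, 3))}
print(phrase_pair_f1(pred, gold))  # 0.5
```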
phrase pair is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Shi, Xing and Knight, Kevin and Ji, Heng
Training
Second, we extract phoneme phrase pairs consistent with these alignments.
Training
From the example above, we pull out phrase pairs like:
Training
We add these phrase pairs to FST B, and call this the phoneme-phrase-based model.
phrase pair is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: