Abstract | We propose a log-linear model to compute the paraphrase likelihood of two patterns and exploit feature functions based on maximum likelihood estimation (MLE) and lexical weighting (LW).
Conclusion | We use a log-linear model to compute the paraphrase likelihood and exploit feature functions based on MLE and LW.
Conclusion | In addition, the log-linear model with the proposed feature functions significantly outperforms the conventional models. |
Experiments | 4.1 Evaluation of the Log-linear Model |
Experiments | As previously mentioned, in the log-linear model of this paper, we use both MLE-based and LW-based feature functions.
Experiments | In this section, we evaluate the log-linear model (LL-Model) and compare it with the MLE-based model (MLE-Model) presented by Bannard and Callison-Burch (2005).
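For reference, the MLE-based pivot estimate of Bannard and Callison-Burch (2005) marginalizes over shared foreign phrases f; a sketch of its usual form (reproduced from the cited work, not from this text):

```latex
\hat{p}(e_2 \mid e_1) = \sum_{f} p(f \mid e_1)\, p(e_2 \mid f)
```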
Introduction | parsing and English-foreign language word alignment, (2) aligned patterns induction, which produces English patterns along with the aligned pivot patterns in the foreign language, (3) paraphrase patterns extraction, in which paraphrase patterns are extracted based on a log-linear model.
Introduction | Secondly, we propose a log-linear model for computing the paraphrase likelihood. |
Introduction | Moreover, the log-linear model is more effective than the conventional model presented in (Bannard and Callison-Burch, 2005).
Proposed Method | In order to exploit more and richer information to estimate the paraphrase likelihood, we propose a log-linear model: |
Proposed Method | In this paper, four feature functions are used in our log-linear model, which include:
Abstract | Although the log-linear model achieves success in SMT, it still suffers from some limitations: (1) the features are required to be linear with respect to the model itself; (2) features cannot be further interpreted to reach their potential. |
Introduction | Recently, great progress has been achieved in SMT, especially since Och and Ney (2002) proposed the log-linear model: almost all the state-of-the-art SMT systems are based on the log-linear model.
Introduction | Regardless of how successful the log-linear model is in SMT, it still has some shortcomings. |
Introduction | Compared with the log-linear model, it has more powerful expressive abilities and can deeply interpret and represent features with hidden units in neural networks.
Abstract | This paper describes log-linear models for a general-purpose sentence realizer based on dependency structures. |
Abstract | Then the best linearizations compatible with the relative order are selected by log-linear models.
Abstract | The log-linear models incorporate three types of feature functions, including dependency relations, surface words and headwords. |
Introduction | The other is a log-linear model with different syntactic and semantic features (Velldal and Oepen, 2005; Nakanishi et al., 2005; Cahill et al., 2007).
Introduction | Compared with the n-gram model, the log-linear model is more powerful in that it can easily integrate a variety of features and tune their weights to maximize the probability.
Introduction | This paper presents a general-purpose realizer based on log-linear models for directly linearizing dependency relations given dependency structures. |
Log-linear Models | We use log-linear models for selecting the sequence with the highest probability from all the possible linearizations of a subtree. |
Log-linear Models | 4.1 The Log-linear Model |
Log-linear Models | Log-linear models employ a set of feature functions to describe properties of the data, and a set of learned weights to determine the contribution of each feature. |
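The description above (feature functions describing the data, learned weights scaling each feature's contribution, normalized into a probability) can be sketched minimally as follows; all feature names, weights, and values here are illustrative, not taken from any of the quoted papers:

```python
import math

def score(weights, features):
    """Linear score: sum of weight * feature value for each active feature."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def log_linear_prob(weights, candidates):
    """Normalize exponentiated scores over all candidate outputs.

    `candidates` maps each output label to its feature dict.
    """
    scores = {y: score(weights, feats) for y, feats in candidates.items()}
    z = sum(math.exp(s) for s in scores.values())  # partition function
    return {y: math.exp(s) / z for y, s in scores.items()}

# Toy usage: two candidate outputs described by two hypothetical features.
weights = {"lm": 0.6, "tm": 1.2}
candidates = {
    "good": {"lm": 1.0, "tm": 1.0},
    "bad": {"lm": 0.2, "tm": 0.1},
}
probs = log_linear_prob(weights, candidates)
```

The resulting `probs` is a proper distribution over the candidates, with higher-scoring feature vectors receiving more mass.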
Abstract | Experimental results demonstrate that our method can produce compact and accurate models much more quickly than a state-of-the-art quasi-Newton method for L1-regularized log-linear models.
Introduction | Log-linear models (a.k.a. maximum entropy models) are one of the most widely-used probabilistic models in the field of natural language processing (NLP).
Introduction | Log-linear models have a major advantage over other |
Introduction | Kazama and Tsujii (2003) describe a method for training a L1-regularized log-linear model with a bound constrained version of the BFGS algorithm (Nocedal, 1980).
Log-Linear Models | In this section, we briefly describe log-linear models used in NLP tasks and L1 regularization. |
Log-Linear Models | A log-linear model defines the following probabilistic distribution over possible structures y for input x:
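The equation elided here is presumably the standard conditional log-linear form (a hedged reconstruction; w_i and f_i are generic weight and feature names, not taken from the source):

```latex
p(y \mid x) = \frac{\exp\left(\sum_i w_i f_i(x, y)\right)}{\sum_{y'} \exp\left(\sum_i w_i f_i(x, y')\right)}
```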
Log-Linear Models | The weights of the features in a log-linear model are optimized in such a way that they maximize the regularized conditional log-likelihood of the training data: |
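Given the L1 regularization discussed in this section, the objective referred to is presumably of this standard form (a sketch; C is a generic regularization constant, not the paper's notation):

```latex
w^{*} = \arg\max_{w} \left[ \sum_{j} \log p(y_j \mid x_j; w) \;-\; C \sum_{i} |w_i| \right]
```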
Argument Identification | where p_θ is a log-linear model normalized over the set R_y, with features described in Table 1.
Argument Identification | Inference Although our learning mechanism uses a local log-linear model, we perform inference globally on a per-frame basis by applying hard structural constraints.
Discussion | However, since the input representation is shared across all frames, every other training example from all the lexical units affects the optimal estimate, since they all modify the joint parameter matrix M. By contrast, in the log-linear models each label has its own set of parameters, and they interact only via the normalization constant. |
Discussion | They also use a log-linear model, but they incorporate a latent variable that uses WordNet (Fellbaum, 1998) to get lexical-semantic relationships and smooths over frames for ambiguous lexical units.
Discussion | Another difference is that when training the log-linear model , they normalize over all frames, while we normalize over the allowed frames for the current lexical unit. |
Experiments | The baselines use a log-linear model that models the following probability at training time: |
Experiments | For comparison with our model from §3, which we call WSABIE EMBEDDING, we implemented two baselines with the log-linear model . |
Experiments | So the second baseline has the same input representation as WSABIE EMBEDDING but uses a log-linear model instead of WSABIE. |
A Generic Phrase Training Procedure | Note that under the log-linear model, applying a threshold for filtering is equivalent to comparing the “likelihood” ratio.
Conclusions | In this paper, the problem of extracting phrase translation is formulated as an information retrieval process implemented with a log-linear model aiming for balanced precision and recall.
Discussions | The generic phrase training algorithm follows an information retrieval perspective as in (Venugopal et al., 2003) but aims to improve both precision and recall with the trainable log-linear model.
Discussions | Under the general framework, one can put as many features as possible together under the log-linear model to evaluate the quality of a phrase and a phrase pair.
Experimental Results | Our decoder is a phrase-based multi-stack implementation of the log-linear model similar to Pharaoh (Koehn et al., 2003). |
Experimental Results | Like other log-linear model based decoders, active features in our translation engine include translation models in two directions, lexicon weights in two directions, language model, lexicalized distortion models, sentence length penalty and other heuristics. |
Experimental Results | Since the translation engine implements a log-linear model , the discriminative training of feature weights in the decoder should be embedded in the whole end-to-end system jointly with the discriminative phrase table training process. |
Abstract | We build a log-linear model that incorporates these asymmetries for ranking German string realisations from input LFG F-structures.
Generation Ranking | (2007), a log-linear model based on the Lexical Functional Grammar (LFG) Framework (Kaplan and Bresnan, 1982). |
Generation Ranking | (2007) describe a log-linear model that uses linguistically motivated features and improves over a simple trigram language model baseline. |
Generation Ranking | We take this log-linear model as our starting point.
Generation Ranking Experiments | We tune the parameters of the log-linear model on a small development set of 63 sentences, and carry out the final evaluation on 261 unseen sentences. |
Generation Ranking Experiments | We evaluate the string chosen by the log-linear model against the original treebank string in terms of exact match and BLEU score (Papineni et al., |
Discriminative Synchronous Transduction | 3.1 A global log-linear model |
Discriminative Synchronous Transduction | Our findings echo those observed for latent variable log-linear models successfully used in monolingual parsing (Clark and Curran, 2007; Petrov et al., 2007). |
Discriminative Synchronous Transduction | This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables (Clark and Curran, 2004; Petrov et al., 2007). |
Introduction | First, we develop a log-linear model of translation which is globally trained on a significant number of parallel sentences. |
Product of Experts | For these bigram or trigram overlap features, a similar log-linear model has to be normalized with a partition function, which considers the (unnormalized) scores of all possible target sentences, given the source sentence. |
QG for Paraphrase Modeling | We use log-linear models three times: for the configuration, the lexical semantics class, and the word.
QG for Paraphrase Modeling | (2007), we employ a 14-feature log-linear model over all logically possible combinations of the 14 WordNet relations (Miller, 1995). Similarly to Eq.
QG for Paraphrase Modeling | 14, we normalize this log-linear model based on the set of relations that are nonempty in WordNet for the word 3360-). |
Introduction | Word embedding is used as the input to learn translation confidence score, which is combined with commonly used features in the conventional log-linear model.
Our Model | The differences between our model and the conventional log-linear model include:
Phrase Pair Embedding | Instead of integrating the sparse features directly into the log-linear model, we use them as the input to learn a phrase pair embedding.
Phrase Pair Embedding | To train the neural network, we add the confidence scores to the conventional log-linear model as features. |
Related Work | Together with other commonly used features, the translation confidence score is integrated into a conventional log-linear model.
A Joint Model for Two Formalisms | Instead, we assume that the distribution over y_CFG is a log-linear model with parameters θ_CFG (i.e., a sub-vector of θ), namely:
Evaluation Setup | In this setup, the model reduces to a normal log-linear model for the target formalism. |
Experiment and Analysis | It is not surprising that Cahill’s model outperforms our log-linear model, because it relies heavily on handcrafted rules optimized for the dataset.
Features | Feature functions in log-linear models are designed to capture the characteristics of each derivation in the tree. |
Baselines | where m ranges over IN and OUT, p_m(ē|f) is an estimate from a component phrase table, and each λ_m is a weight in the top-level log-linear model, set so as to maximize dev-set BLEU using minimum error rate training (Och, 2003).
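The mixture that this clause glosses is presumably a log-linear combination of the component phrase-table estimates (a hedged reconstruction, not the paper's own equation):

```latex
p(\bar{e} \mid f) \;\propto\; \prod_{m \in \{\text{IN},\, \text{OUT}\}} p_m(\bar{e} \mid f)^{\lambda_m}
```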
Ensemble Decoding | In the typical log-linear model SMT, the posterior |
Ensemble Decoding | Since in log-linear models, the model scores are not normalized to form probability distributions, the scores that different models assign to each phrase-pair may not be in the same scale.
Experiments & Results 4.1 Experimental Setup | It was filtered to retain the top 20 translations for each source phrase using the TM part of the current log-linear model . |
Conclusion and Future Work | The consensus statistics are integrated into the conventional log-linear model as features. |
Experiments and Results | Instead of using graph-based consensus confidence as features in the log-linear model , we perform structured label propagation (Struct-LP) to re-rank the n-best list directly, and the similarity measures for source sentences and translation candidates are symmetrical sentence level BLEU (equation (10)). |
Features and Training | Therefore, we can alternatively update graph-based consensus features and feature weights in the log-linear model . |
Graph-based Translation Consensus | Our MT system with graph-based translation consensus adopts the conventional log-linear model . |
Abstract | We present a Bayesian model that clusters together phonetic variants of the same lexical item while learning both a language model over lexical items and a log-linear model of pronunciation variability based on articulatory features. |
Introduction | Our model is conceptually similar to those used in speech recognition and other applications: we assume the intended tokens are generated from a bigram language model and then distorted by a noisy channel, in particular a log-linear model of phonetic variability. |
Lexical-phonetic model | (2008), we parameterize these distributions with a log-linear model . |
Lexical-phonetic model | In modern phonetics and phonology, these generalizations are usually expressed as Optimality Theory constraints; log-linear models such as ours have previously been used to implement stochas- |
A semantic span can include one or more eus. | Following Och and Ney (2002), our model is framed as a log-linear model: |
A semantic span can include one or more eus. | encourage the decoder to generate transitional words and phrases; the score is utilized as an additional feature h_k(e_s, f_t) in the log-linear model.
A semantic span can include one or more eus. | In general, according to formula (3), the translation quality based on the log-linear model is tightly related to the features chosen.
Conclusion | Our contributions can be summarized as: 1) the new translation rules are more discriminative and sensitive to cohesive information by converting the source string into a CSS-based tagged-flattened string; 2) the new additional features embedded in the log-linear model can encourage the decoder to produce transitional expressions.
Related Work | (2013) went beyond the log-linear model for SMT and proposed a novel additive neural networks based translation model, which overcomes some of the shortcomings suffered by the log-linear model: linearity and the lack of deep interpretation and representation in features.
Semi-Supervised Deep Auto-encoder Features Learning for SMT | Each translation rule in the phrase-based translation model has a set of features that are combined in the log-linear model (Och and Ney, 2002), and our semi-supervised DAE features can also be combined in this model.
Semi-Supervised Deep Auto-encoder Features Learning for SMT | To combine these learned features (DBN and DAE features) into the log-linear model, we need to eliminate the impact of the nonlinear learning mechanism.
Experimental Setup | The decoder’s log-linear model includes a standard feature set. |
Experimental Setup | The decoder’s log-linear model is tuned with MERT (Och, 2003). |
Experimental Setup | Both the decoder’s log-linear model and the re-ranking models are trained on the same development set. |
Experiments | We evaluate the performance of adding new topic-related features to the log-linear model and compare the translation accuracy with the method in (Xiao et al., 2012). |
Introduction | We integrate topic similarity features in the log-linear model and evaluate the performance on the NIST Chinese-to-English translation task. |
Topic Similarity Model with Neural Network | The similarity scores are integrated into the standard log-linear model for making translation decisions. |
Our Proposal: A Latent LC Approach | We address the task with a latent variable log-linear model , representing the LCs of the predicates. |
Our Proposal: A Latent LC Approach | The introduction of latent variables into the log-linear model leads to a non-convex objective function. |
Our Proposal: A Latent LC Approach | Once h has been fixed, the model collapses to a convex log-linear model . |
Batch Models | We use log-linear models over reasonable alternatives such as perceptron or SVM, following the practice of a wide range of previous work in related areas (Smith, 2004; Liu et al., 2005; Poon et al., 2009) including text classification in social media (Van Durme, 2012b; Yang and Eisenstein, 2013).
Batch Models | The corresponding log-linear model is defined as: |
Experimental Setup | We experiment with log-linear models defined in Eq. |
Experimental Setup | But instead of using just the PMI scores of bilingual NE pairs, as in our work, they employed a feature-rich log-linear model to capture bilingual correlations. |
Experimental Setup | Parameters in their log-linear model require training with bilingually annotated data, which is not readily available. |
Related Work | (2010a) presented a supervised learning method for performing joint parsing and word alignment using log-linear models over parse trees and an ITG model over alignment. |
Building Dialog Trees from Instructions | Given a single instruction i with category a_i, we use a log-linear model to represent the distribution
Understanding Initial Queries | We employ a log-linear model and try to maximize initial dialog state distribution over the space of all nodes in a dialog network: |
Understanding Query Refinements | Dialog State Update Model We use a log-linear model to maximize a dialog state distribution over the space of all nodes in a dialog network: |
Inference | We then report the corresponding chains c(a) as the system output. For learning, the gradient takes the standard form of the gradient of a log-linear model, a difference of expected feature counts under the gold annotation and under no annotation.
Introduction | We use a log-linear model that can be expressed as a factor graph. |
Models | The final log-linear model is given by the following formula: |
Markov Topic Regression - MTR | log-linear models with parameters, λ_i ∈ R^M, is
Markov Topic Regression - MTR | labeled data, 712?, based on the log-linear model in Eq. |
Semi-Supervised Semantic Labeling | The x is used as the input matrix of the k-th log-linear model (corresponding to the k-th semantic tag (topic)) to infer the β hyper-parameter of MTR in Eq.
Abstract | Och (2003) proposed using a log-linear model to incorporate multiple features for translation, and proposed a minimum error rate training (MERT) method to train the feature weights to optimize a desirable translation metric. |
Abstract | While the log-linear model itself is discriminative, the phrase and lexicon translation features, which are among the most important components of SMT, are derived from either generative models or heuristics (Koehn et al., 2003, Brown et al., 1993). |
Abstract | In that work, multiple features, most of which are derived from generative models, are incorporated into a log-linear model, and their relative weights are tuned discriminatively on a small tuning set.
Background | where Pr(e|f) is the probability that e is the translation of the given source string f. To model the posterior probability Pr(e|f), most of the state-of-the-art SMT systems utilize the log-linear model proposed by Och and Ney (2002), as follows,
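The Och and Ney (2002) formulation introduced here is standard in the SMT literature; a sketch of its usual form, using the feature and weight notation of the surrounding sentences:

```latex
\Pr(e \mid f) \approx p_{\lambda}(e \mid f)
  = \frac{\exp\left(\sum_{m=1}^{M} \lambda_m h_m(e, f)\right)}
         {\sum_{e'} \exp\left(\sum_{m=1}^{M} \lambda_m h_m(e', f)\right)}
```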
Background | In this paper, u denotes a log-linear model that has M fixed features {h_1(f,e), ..., h_M(f,e)}, λ = {λ_1, ..., λ_M} denotes the M parameters of u, and u(λ) denotes an SMT system based on u with parameters λ.
Background | In this paper, we use the term training set to emphasize the training of the log-linear model.
Collaborative Decoding | The requirement for a log-linear model aims to provide a natural way to integrate the new co-decoding features. |
Collaborative Decoding | Referring to the log-linear model formulation, the translation posterior P(e'|dk) can be computed as: |
Conclusion | In this paper, we present a framework of collaborative decoding, in which multiple MT decoders are coordinated to search for better translations by re-ranking partial hypotheses using augmented log-linear models with translation-consensus-based features.
Consensus Decoding Algorithms | The distribution P(e|f) can be induced from a translation system’s features and weights by exponentiating with base b to form a log-linear model:
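With a tunable base b (the same b whose tuning is reported later in this section), the induced distribution would take the following form (a hedged sketch, using generic feature and weight symbols):

```latex
P(e \mid f) = \frac{b^{\sum_m \lambda_m h_m(e, f)}}{\sum_{e'} b^{\sum_m \lambda_m h_m(e', f)}}
```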
Experimental Results | The log-linear model weights were trained using MIRA, a margin-based optimization procedure that accommodates many features (Crammer and Singer, 2003; Chiang et al., 2008). |
Experimental Results | We tuned b, the base of the log-linear model , to optimize consensus decoding performance. |
Cohesive Decoding | This count becomes a feature in the decoder’s log-linear model , the weight of which is trained with MERT. |
Experiments | Weights for the log-linear model are set using MERT, as implemented by Venugopal and Vogel (2005). |
Experiments | Since adding features to the decoder’s log-linear model is straightforward, we also experiment with a combined system that uses both the cohesion constraint and a lexical reordering model. |