Index of papers in Proc. ACL 2009 that mention
  • log-linear
He, Wei and Wang, Haifeng and Guo, Yuqing and Liu, Ting
Abstract
This paper describes log-linear models for a general-purpose sentence realizer based on dependency structures.
Abstract
Then the best linearizations compatible with the relative order are selected by log-linear models.
Abstract
The log-linear models incorporate three types of feature functions, including dependency relations, surface words and headwords.
Introduction
The other is a log-linear model with different syntactic and semantic features (Velldal and Oepen, 2005; Nakanishi et al., 2005; Cahill et al., 2007).
Introduction
Compared with the n-gram model, the log-linear model is more powerful in that it is easy to integrate a variety of features and to tune feature weights to maximize the probability.
Introduction
This paper presents a general-purpose realizer based on log-linear models for directly linearizing dependency relations given dependency structures.
Log-linear Models
We use log-linear models for selecting the sequence with the highest probability from all the possible linearizations of a subtree.
Log-linear Models
4.1 The Log-linear Model
Log-linear Models
Log-linear models employ a set of feature functions to describe properties of the data, and a set of learned weights to determine the contribution of each feature.
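Taken together, these excerpts describe the standard conditional log-linear form. As a hedged sketch in generic notation (the candidate set GEN(t), feature functions f_i, and weights \lambda_i are labels introduced here, not necessarily the paper's), the probability of a linearization e of a subtree t and the selection rule would be:

  p(e \mid t) = \frac{\exp\left(\sum_i \lambda_i f_i(e, t)\right)}{\sum_{e' \in \mathrm{GEN}(t)} \exp\left(\sum_i \lambda_i f_i(e', t)\right)}, \qquad \hat{e} = \arg\max_{e \in \mathrm{GEN}(t)} p(e \mid t)

where GEN(t) denotes the set of linearizations compatible with the dependency subtree.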
log-linear is mentioned in 22 sentences in this paper.
Tsuruoka, Yoshimasa and Tsujii, Jun'ichi and Ananiadou, Sophia
Abstract
Experimental results demonstrate that our method can produce compact and accurate models much more quickly than a state-of-the-art quasi-Newton method for L1-regularized log-linear models.
Introduction
Log-linear models (a.k.a. maximum entropy models) are one of the most widely-used probabilistic models in the field of natural language processing (NLP).
Introduction
Log-linear models have a major advantage over other
Introduction
Kazama and Tsujii (2003) describe a method for training an L1-regularized log-linear model with a bound constrained version of the BFGS algorithm (Nocedal, 1980).
Log-Linear Models
In this section, we briefly describe log-linear models used in NLP tasks and L1 regularization.
Log-Linear Models
A log-linear model defines the following probability distribution over possible structures y for input x:
Log-Linear Models
The weights of the features in a log-linear model are optimized in such a way that they maximize the regularized conditional log-likelihood of the training data:
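The two excerpts above stop at the colons where the paper's equations would appear. A hedged reconstruction in standard notation (weights w_i, features f_i, training data D, and regularization constant C are generic labels, not copied from the paper):

  p(y \mid x; \mathbf{w}) = \frac{\exp\left(\sum_i w_i f_i(x, y)\right)}{\sum_{y'} \exp\left(\sum_i w_i f_i(x, y')\right)}

  L(\mathbf{w}) = \sum_{(x, y) \in D} \log p(y \mid x; \mathbf{w}) \;-\; C \sum_i |w_i|

The second term is the L1 penalty, which drives many weights to exactly zero and yields the compact models mentioned in the abstract.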
log-linear is mentioned in 12 sentences in this paper.
Cahill, Aoife and Riester, Arndt
Abstract
We investigate the influence of information status (IS) on constituent order in German, and integrate our findings into a log-linear surface realisation ranking model.
Abstract
We build a log-linear model that incorporates these asymmetries for ranking German string realisations from input LFG F-structures.
Conclusions
By calculating strong asymmetries between pairs of IS labels, and establishing the most frequent syntactic characteristics of these asymmetries, we designed a new set of features for a log-linear ranking model.
Generation Ranking
(2007), a log-linear model based on the Lexical Functional Grammar (LFG) Framework (Kaplan and Bresnan, 1982).
Generation Ranking
(2007) describe a log-linear model that uses linguistically motivated features and improves over a simple trigram language model baseline.
Generation Ranking
We take this log-linear model as our starting point.
Generation Ranking Experiments
These are all automatically removed from the list of features to give a total of 130 new features for the log-linear ranking model.
Generation Ranking Experiments
We train the log-linear ranking model on 7759 F-structures from the TIGER treebank.
Generation Ranking Experiments
We tune the parameters of the log-linear model on a small development set of 63 sentences, and carry out the final evaluation on 261 unseen sentences.
log-linear is mentioned in 10 sentences in this paper.
Das, Dipanjan and Smith, Noah A.
Product of Experts
These features have to be included in estimating pkn-d, which has log-linear component models (Eq.
Product of Experts
For these bigram or trigram overlap features, a similar log-linear model has to be normalized with a partition function, which considers the (unnormalized) scores of all possible target sentences, given the source sentence.
QG for Paraphrase Modeling
We use log-linear models three times: for the configuration, the lexical semantics class, and the word.
QG for Paraphrase Modeling
(2007), we employ a 14-feature log-linear model over all logically possible combinations of the 14 WordNet relations (Miller, 1995).
QG for Paraphrase Modeling
Similarly to Eq. 14, we normalize this log-linear model based on the set of relations that are nonempty in WordNet for the word in question.
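As a hedged illustration of the locally normalized model these excerpts describe (generic notation: \theta are weights, g a feature vector, and R(w) the set of WordNet relations that are nonempty for word w; none of these symbols are taken from the paper):

  p(r \mid w) = \frac{\exp\left(\boldsymbol{\theta} \cdot \mathbf{g}(r, w)\right)}{\sum_{r' \in R(w)} \exp\left(\boldsymbol{\theta} \cdot \mathbf{g}(r', w)\right)}

Restricting the partition function to R(w) avoids the intractable normalization over all possible target sentences mentioned in the Product of Experts excerpt.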
log-linear is mentioned in 7 sentences in this paper.
Branavan, S.R.K. and Chen, Harr and Zettlemoyer, Luke and Barzilay, Regina
A Log-Linear Model for Actions
Given a state s = (E, d, j, W), the space of possible next actions is defined by enumerating sub-spans of unused words in the current sentence (i.e., subspans of the jth sentence of d not in W), and the possible commands and parameters in environment state E. We model the policy distribution p(a|s; θ) over this action space in a log-linear fashion (Della Pietra et al., 1997; Lafferty et al., 2001), giving us the flexibility to incorporate a diverse range of features.
Abstract
We use a policy gradient algorithm to estimate the parameters of a log-linear model for action selection.
Introduction
Our policy is modeled in a log-linear fashion, allowing us to incorporate features of both the instruction text and the environment.
Reinforcement Learning
which is the derivative of a log-linear distribution.
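The derivative referred to here is the standard score-function identity for log-linear distributions. A hedged sketch with a generic feature vector \phi(s, a) (notation introduced here, not necessarily the paper's):

  p(a \mid s; \boldsymbol{\theta}) = \frac{\exp\left(\boldsymbol{\theta} \cdot \boldsymbol{\phi}(s, a)\right)}{\sum_{a'} \exp\left(\boldsymbol{\theta} \cdot \boldsymbol{\phi}(s, a')\right)}

  \nabla_{\boldsymbol{\theta}} \log p(a \mid s; \boldsymbol{\theta}) = \boldsymbol{\phi}(s, a) - \sum_{a'} p(a' \mid s; \boldsymbol{\theta})\, \boldsymbol{\phi}(s, a')

That is, the features of the chosen action minus their expectation under the current policy, which is exactly the quantity a policy gradient update needs.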
log-linear is mentioned in 4 sentences in this paper.
Li, Mu and Duan, Nan and Zhang, Dongdong and Li, Chi-Ho and Zhou, Ming
Collaborative Decoding
In our work, any Maximum A Posteriori (MAP) SMT model with log-linear formulation (Och, 2002) can be a qualified candidate for a baseline model.
Collaborative Decoding
The requirement for a log-linear model aims to provide a natural way to integrate the new co-decoding features.
Collaborative Decoding
Referring to the log-linear model formulation, the translation posterior P(e'|d_k) can be computed as:
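The colon introduces an equation not reproduced in this index. A hedged guess at its standard shape, normalizing the log-linear score over decoder d_k's own hypothesis space H_k (H_k, feature functions h_m, and weights \lambda_m are generic notation, not the paper's):

  P(e' \mid d_k) = \frac{\exp\left(\sum_m \lambda_m h_m(e', f)\right)}{\sum_{e'' \in H_k} \exp\left(\sum_m \lambda_m h_m(e'', f)\right)}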
Conclusion
In this paper, we present a framework of collaborative decoding, in which multiple MT decoders are coordinated to search for better translations by re-ranking partial hypotheses using augmented log-linear models with translation consensus-based features.
log-linear is mentioned in 4 sentences in this paper.
Xiong, Deyi and Zhang, Min and Aw, Aiti and Li, Haizhou
Analysis
We want to further study what happens after we integrate the constraint feature (our SDB model and Marton and Resnik’s XP+) into the log-linear translation model.
Introduction
These constituent matching/violation counts are used as a feature in the decoder’s log-linear model and their weights are tuned via minimum error rate training (MERT) (Och, 2003).
Introduction
Similar to previous methods, our SDB model is integrated into the decoder’s log-linear model as a feature so that we can inherit the idea of soft constraints.
The Syntax-Driven Bracketing Model 3.1 The Model
new feature into the log-linear translation model: P_SDB(b|T, ...). This feature is computed by the SDB model described in equation (3) or equation (4), which estimates a probability that a source span is to be translated as a unit within particular syntactic contexts.
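A hedged sketch of how such a feature typically enters an Och (2002) style log-linear translation model (the existing features h_m, their weights \lambda_m, and the new weight \lambda_{SDB} are generic notation, not taken from the paper):

  e^{*} = \arg\max_{e} \; \sum_m \lambda_m h_m(e, f) \;+\; \lambda_{\mathrm{SDB}} \log P_{\mathrm{SDB}}(b \mid T, \ldots)

with \lambda_{SDB} tuned alongside the other weights by MERT, as the Introduction excerpt notes.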
log-linear is mentioned in 4 sentences in this paper.
DeNero, John and Chiang, David and Knight, Kevin
Consensus Decoding Algorithms
The distribution P(e|f) can be induced from a translation system’s features and weights by exponentiating with base b to form a log-linear model:
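A hedged reconstruction of the equation this colon introduces, in generic notation (feature vector h and weight vector \theta are labels introduced here): exponentiating the weighted feature score with base b gives

  P(e \mid f) = \frac{b^{\,\boldsymbol{\theta} \cdot \mathbf{h}(e, f)}}{\sum_{e'} b^{\,\boldsymbol{\theta} \cdot \mathbf{h}(e', f)}}

where e' ranges over candidate translations; a larger base b makes the distribution more peaked, which is why b is tuned in the Experimental Results excerpts below.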
Experimental Results
The log-linear model weights were trained using MIRA, a margin-based optimization procedure that accommodates many features (Crammer and Singer, 2003; Chiang et al., 2008).
Experimental Results
We tuned b, the base of the log-linear model, to optimize consensus decoding performance.
log-linear is mentioned in 3 sentences in this paper.