Index of papers in Proc. ACL that mention

objective function

Seen in text as:

objective function (262)
objective function: (13)
objective functions (8)
Objective Function (8)
Objective function (6)

Seen in 293 sentences in 61 papers.

1. Subtree Extractive Summarization via Submodular Maximization

Morita, Hajime and Sasano, Ryohei and Takamura, Hiroya and Okumura, Manabu

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Budgeted Submodular Maximization with Cost Function	The algorithm it-eratively adds to the current summary the element 3,- that has the largest ratio of the objective function gain to the additional cost, unless adding it violates the budget constraint.
Budgeted Submodular Maximization with Cost Function	After the loop, the algorithm compares Gi with the {3*} that has the largest value of the objective function among all subtrees that are under the budget, and it outputs the summary candidate with the largest value.
Joint Model of Extraction and Compression	4.1 Objective Function
Joint Model of Extraction and Compression	We designed our objective function by combining this relevance score with a penalty for redundancy and too-compressed sentences.
Joint Model of Extraction and Compression	The behavior can be represented by a submodular objective function that reduces word scores depending on those already included in the summary.

objective function is mentioned in 19 sentences in this paper.

Topics mentioned in this paper:

2. Summarization Through Submodularity and Dispersion

Dasgupta, Anirban and Kumar, Ravi and Ravi, Sujith

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	of our system that approximates the submodular objective function proposed by (Lin and Bilmes, 2011).7 As shown in the results, our best system8 which uses the hs dispersion function achieves a better ROUGE-1 F-score than all other systems.
Experiments	(3) We also analyze the contributions of individual components of the new objective function towards summarization performance by selectively setting certain parameters to 0.
Experiments	However, since individual components within our objective function are parametrized it is easy to tune them for a specific task or genre.
Framework	We start by describing a generic objective function that can be widely applied to several summarization scenarios.
Framework	This objective function is the sum of a monotone submodular coverage function and a non-submodular dispersion function.
Framework	We then describe a simple greedy algorithm for optimizing this objective function with provable approximation guarantees for three natural dispersion functions.
Using the Framework	generate a graph and instantiate our summarization objective function with specific components that capture the desiderata of a given summarization task.
Using the Framework	We model this property in our objective function as follows.
Using the Framework	We then add this component to our objective function as w(S) = Zues Mu)-

objective function is mentioned in 16 sentences in this paper.

Topics mentioned in this paper:

3. A Class of Submodular Functions for Document Summarization

Lin, Hui and Bilmes, Jeff

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	As shown in Table 1, optimizing this objective function gives a ROUGE-1 F—measure score 32.44%.
Experiments	Figure 1: ROUGE-1 F-measure scores on DUC—03 when oz and K vary in objective function 51(8) —\|— AR1(S),
Introduction	Of course, none of this is useful if the objective function .7: is inappropriate for the summarization task.
Monotone Submodular Objectives	Objective functions for extractive summarization usually measure these two separately and then mix them together trading off encouraging relevance and penalizing redundancy.
Monotone Submodular Objectives	The redundancy penalty usually violates the monotonicity of the objective functions (Carbonell and Goldstein, 1998; Lin and Bilmes, 2010).
Submodularity in Summarization	(1999) on the budgeted maximum cover problem to the general submodular framework, and show a practical greedy algorithm with a (1 — 1/ fi)-approximation factor, where each greedy step adds the unit with the largest ratio of objective function gain to scaled cost, while not violating the budget constraint (see (Lin and Bilmes, 2010) for details).
Submodularity in Summarization	In particular, Carbonell and Goldstein (1998) define an objective function gain of adding element k to set S (k ¢ 8) as:
Submodularity in Summarization	Although the authors may not have noticed, their objective functions are also submodular, adding more evidence suggesting that submodularity is natural for summarization tasks.

objective function is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

4. Grammatical Error Correction Using Integer Linear Programming

Wu, Yuanbin and Ng, Hwee Tou

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Inference with First Order Variables	Express the inference objective as a linear objective function ; and
Inference with First Order Variables	For the grammatical error correction task, the variables in ILP are indicators of the corrections that a word needs, the objective function measures how grammatical the whole sentence is if some corrections are accepted, and the constraints guarantee that the corrections do not conflict with each other.
Inference with First Order Variables	3.2 The Objective Function
Inference with Second Order Variables	(17) A new objective function combines the weights from both first and second order variables:
Introduction	Variables of ILP are indicators of possible grammatical error corrections, the objective function aims to select the best set of corrections, and the constraints help to enforce a valid and grammatical output.

objective function is mentioned in 9 sentences in this paper.

Topics mentioned in this paper:

5. Collective Generation of Natural Image Descriptions

Kuznetsova, Polina and Ordonez, Vicente and Berg, Alexander and Berg, Tamara and Choi, Yejin

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Image-level Content Planning	4.1 Variables and Objective Function The following set of indicator variables encodes the selection of objects and ordering:
Image-level Content Planning	The objective function , F, that we will maximize is a weighted linear combination of these indicator variables and can be optimized using integer linear programming:
Image-level Content Planning	We use IBM CPLEX to optimize this objective function subject to the constraints introduced next in §4.2.
Surface Realization	5.1 Variables and Objective Function The following set of variables encodes the selection of phrases and their ordering in constructing 5’ sentences.
Surface Realization	Finally, we define the objective function F as:
Surface Realization	the objective function (Eq.

objective function is mentioned in 9 sentences in this paper.

Topics mentioned in this paper:

6. Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization

Zhou, Guangyou and Liu, Fang and Liu, Yang and He, Shizhu and Zhao, Jun

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Our Approach	Wh611 noring the coupling between Vp, it can be solved by minimizing the objective function as follows:
Our Approach	Combining equations (1) and (2), we get the following objective function:
Our Approach	If we set a small value for Ap, the objective function behaves like the traditional NMF and the importance of data sparseness is emphasized; while a big value of Ap indicates Vp should be very closed to V1, and equation (3) aims to remove the noise introduced by statistical machine translation.

objective function is mentioned in 8 sentences in this paper.

Topics mentioned in this paper:

7. Phrase Clustering for Discriminative Learning

Lin, Dekang and Wu, Xiaoyun

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion and Related Work	Ando and Zhang (2005) defined an objective function that combines the original problem on the labeled data with a set of auxiliary problems on unlabeled data.
Discussion and Related Work	The combined objective function is then alternatingly optimized with the labeled and unlabeled data.
Introduction	The learning algorithm then optimizes a regularized, convex objective function that is expressed in terms of these features.
Introduction	distributed clustering algorithm with a similar objective function as the Brown algorithm.
Query Classification	We made a small modification to the objective function for logistic regression to take into account the prior distribution and to use 50% as a uniform decision boundary for all the classes.
Query Classification	When training the classifier for a class with [9 positive examples out of a total of n examples, we change the objective function to:
Query Classification	We suspect that such features make the optimization of the objective function much more difficult.

objective function is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

8. Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data

Finkel, Jenny Rose and Manning, Christopher D.

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	We are also interested in ways to modify the objective function to place more emphasis on learning a good joint model, instead of equally weighting the learning of the joint and single-task models.
Hierarchical Joint Learning	L-BFGS and gradient descent, two frequently used numerical optimization algorithms, require computing the value and partial derivatives of the objective function using the entire training set.
Hierarchical Joint Learning	It requires a stochastic objective function, which is meant to be a low computational cost estimate of the real objective function .
Hierarchical Joint Learning	In most NLP models, such as logistic regression with a Gaussian prior, computing the stochastic objective function is fairly straightforward: you compute the model likelihood and partial derivatives for a randomly sampled subset of the training data.

objective function is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

9. Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Background	This objective function can be optimized by the stochastic gradient method or other numerical optimization methods.
Method	The squared-loss criterion1 is used to formulate the objective function .
Method	Thus, the objective function can be optimized by L-BFGS-B (Zhu et al., 1997), a generic quasi-Newton gradient-based optimizer.
Method	The first term in Equation (5) is the same as Equation (2), which is the traditional CRFs leam-ing objective function on the labeled data.
Related Work	And third, the derived label information from the graph is smoothed into the model by optimizing a modified objective function .

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

10. Using Supervised Bigram-based ILP for Extractive Summarization

Li, Chen and Qian, Xian and Liu, Yang

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Gillick and Favre (Gillick and Favre, 2009) used bigrams as concepts, which are selected from a subset of the sentences, and their document frequency as the weight in the objective function .
Proposed Method 2.1 Bigram Gain Maximization by ILP	where 0;, is an auxiliary variable we introduce that is equal to \|nbflaef — :8 2(3) * 715,8 , and nbyef is a constant that can be dropped from the objective function .
Proposed Method 2.1 Bigram Gain Maximization by ILP	To train this regression model using the given reference abstractive summaries, rather than trying to minimize the squared error as typically done, we propose a new objective function .
Proposed Method 2.1 Bigram Gain Maximization by ILP	The objective function for training is thus to minimize the KL distance:
Related Work	They used a modified objective function in order to consider whether the selected sentence is globally optimal.

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

bigrams (114)
ILP (67)
regression model (16)

11. Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty

Tsuruoka, Yoshimasa and Tsujii, Jun'ichi and Ananiadou, Sophia

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Also, SGD is very easy to implement because it does not need to use the Hessian information on the objective function .
Log-Linear Models	SGD uses a small randomly-selected subset of the training samples to approximate the gradient of the objective function given by Equation 2.
Log-Linear Models	The learning rate parameters for SGD were then tuned in such a way that they maximized the value of the objective function in 30 passes.
Log-Linear Models	Figure 3 shows how the value of the objective function changed as the training proceeded.

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

12. Multilingual Models for Compositional Distributed Semantics

Hermann, Karl Moritz and Blunsom, Phil

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Approach	Further, these approaches typically depend on specific semantic signals such as sentiment- or topic-labels for their objective functions .
Approach	This results in the following objective function:
Approach	The objective function in Equation 2 could be coupled with any two given vector composition functions f, g from the literature.
Conclusion	To summarize, we have presented a novel method for learning multilingual word embeddings using parallel data in conjunction with a multilingual objective function for compositional vector models.
Overview	We describe a multilingual objective function that uses a noise-contrastive update between semantic representations of different languages to learn these word embeddings.

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

13. Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries

Liu, Chang and Ng, Hwee Tou

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion and Future Work	Accordingly, our objective function is replaced by:
The Algorithm	3.4 The Objective Function
The Algorithm	We now define our objective function in terms of the variables.
The Algorithm	We are also constrained by the linear programming framework, hence we set the objective function as

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

BLEU (19)
word-level (15)
n-grams (12)

14. Maximum Expected BLEU Training of Phrase and Lexicon Translation Models

He, Xiaodong and Deng, Li

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Objective function We denote by 0 the set of all the parameters to be optimized, including forward phrase and lexicon translation probabilities and their backward counterparts.
Abstract	Therefore, we design the objective function to be maximized as:
Abstract	First, we propose a new objective function (Eq.

objective function is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

15. Efficient Tree-based Approximation for Entailment Graph Learning

Berant, Jonathan and Dagan, Ido and Adler, Meni and Goldberger, Jacob

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Background	The objective function is the sum of weights over the edges of g and the constraint 137;]- + mjk — mik g 1 on the binary variables enforces that whenever 137; j = mjk = 1, then also 137;], = 1 (transitivity).
Sequential Approximation Algorithms	Then, at each iteration a single node v is reattached (see below) to the FRG in a way that improves the objective function .
Sequential Approximation Algorithms	This is repeated until the value of the objective function cannot be improved anymore by reattaching a node.
Sequential Approximation Algorithms	Clearly, at each reattachment the value of the objective function cannot decrease, since the optimization algorithm considers the previous graph as one of its candidate solutions.

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

16. Efficient, Feature-based, Conditional Random Field Parsing

Finkel, Jenny Rose and Kleeman, Alex and Manning, Christopher D.

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Stochastic Optimization Methods	Stochastic optimization methods have proven to be extremely efficient for the training of models involving computationally expensive objective functions like those encountered with our task (Vishwanathan et al., 2006) and, in fact, the online backpropagation learning used in the neural network parser of Henderson (2004) is a form of stochastic gradient descent.
Stochastic Optimization Methods	In our experiments SGD converged to a lower objective function value than L-BFGS, however it required far
Stochastic Optimization Methods	Utilization of stochastic optimization routines requires the implementation of a stochastic objective function .
The Model	2.2 Computing the Objective Function

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

17. Bootstrapping via Graph Propagation

Whitney, Max and Sarkar, Anoop

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	It is a bootstrapping learning method which uses a graph propagation algorithm with a well defined objective function .
Existing algorithms 3.1 Yarowsky	(2007) provide an objective function for this algorithm using a generalized definition of cross-entropy in terms of Bregman distance, which motivates our objective in section 4.
Graph propagation	6.5 Objective function
Introduction	Variants of this algorithm have been formalized as optimizing an objective function in previous work by Abney (2004) and Haffari and Sarkar (2007), but it is not clear that any perform as well as the Yarowsky algorithm itself.
Introduction	well-understood as minimizing an objective function at each iteration, and it obtains state of the art performance on several different NLP data sets.

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

18. Connotation Lexicon: A Dash of Sentiment Beneath the Surface Meaning

Feng, Song and Kang, Jun Seok and Kuznetsova, Polina and Choi, Yejin

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Connotation Induction Algorithms	Objective function : We aim to maximize: F : (pprosody + (pcoord + (Dneu
Connotation Induction Algorithms	(Dneu : a Z wzged _ zj m Soft constraints (edge weights): The weights in the objective function are set as follows:
Precision, Coverage, and Efficiency	Objective function : We aim to maximize:
Precision, Coverage, and Efficiency	Hard constraints We add penalties to the objective function if the polarity of a pair of words is not consistent with its corresponding semantic relations.
Precision, Coverage, and Efficiency	Notice that dszjlr, d317,; satisfying above inequalities will be always of negative values, hence in order to maximize the objective function , the LP solver will try to minimize the absolute values of dsjj, dsgf, effectively pushing i and j toward the same polarity.

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

19. Hierarchical Summarization: Scaling Up Multi-Document Summarization

Christensen, Janara and Soderland, Stephen and Bansal, Gagan and Mausam

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	SUMMA hierarchically clusters the sentences by time, and then summarizes the clusters using an objective function that optimizes salience and coherence.
Summarizing Within the Hierarchy	4.4 Objective Function
Summarizing Within the Hierarchy	Having estimated salience, redundancy, and two forms of coherence, we can now put this information together into a single objective function that measures the quality of a candidate hierarchical summary.
Summarizing Within the Hierarchy	Intuitively, the objective function should balance salience and coherence.

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

20. Practical Very Large Scale CRFs

Lavergne, Thomas and Cappé, Olivier and Yvon, François

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conditional Random Fields	The objective function is then a smooth convex function to be minimized over an unconstrained
Conditional Random Fields	In the following, we will jointly use both penalty terms, yielding the so-called elastic net penalty (Zhou and Hastie, 2005) which corresponds to the objective function
Conditional Random Fields	However, the introduction of a 61 penalty term makes the optimization of (6) more problematic, as the objective function is no longer differentiable in 0.

objective function is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

21. Integrating history-length interpolation and classes in language modeling

Schütze, Hinrich

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup	The reason is that the objective function maximizes mutual information.
Experimental Setup	Highly differentiated classes for frequent words contribute substantially to this objective function whereas putting all rare words in a few large clusters does not hurt the objective much.
Experimental Setup	However, our focus is on using clustering for improving prediction for rare events; this means that the objective function is counterproductive when contexts are frequency-weighted as they occur in the corpus.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

22. Joint Inference for Fine-grained Opinion Extraction

Yang, Bishan and Cardie, Claire

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Model	The objective function is defined as a linear combination of the potentials from different predictors with a parameter A to balance the contribution of two components: opinion entity identification and opinion relation extraction.
Results	The objective function of ILP-W/O-ENTITY can be represented as
Results	For ILP-W-SINGLE-RE, we simply remove the variables associated with one opinion relation in the objective function (1) and constraints.
Results	The formulation of ILP-W/O-IMPLICIT—RE removes the variables associated with potential 7“,- in the objective function and corresponding constraints.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

relation extraction (20)
CRF (14)
ILP (14)

23. Semi-supervised Learning of Dependency Parsers using Generalized Expectation Criteria

Druck, Gregory and Mann, Gideon and McCallum, Andrew

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Model parameters are estimated using a generalized expectation (GE) objective function that penalizes the mismatch between model predictions and linguistic expectation constraints.
Generalized Expectation Criteria	Generalized expectation criteria (Mann and McCallum, 2008; Druck et al., 2008) are terms in a parameter estimation objective function that express a preference on the value of a model expectation.
Generalized Expectation Criteria	2In general, the objective function could also include the likelihood of available labeled data, but throughout this paper we assume we have no parsed sentences.
Introduction	With GE we may add a term to the objective function that encourages a feature-rich CRF to match this expectation on unlabeled data, and in the process learn about related features.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

24. Anchors Regularized: Adding Robustness and Extensibility to Scalable Topic-Modeling Algorithms

Nguyen, Thang and Hu, Yuening and Boyd-Graber, Jordan

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Adding Regularization	In this section, we briefly review regularizers and then add two regularizers, inspired by Gaussian (L2, Section 3.1) and Dirichlet priors (Beta, Section 3.2), to the anchor objective function (Equation 3).
Adding Regularization	Instead of optimizing a function just of the data cc and parameters 6, f (cc, 6), one optimizes an objective function that includes a regularizer that is only a function of parameters: f (w, 6) + 716).
Adding Regularization	This requires including the topic matrix as part of the objective function .
Anchor Words: Scalable Topic Models	Once we have established the anchor objective function, in the next section we regularize the objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

25. Multi-Task Transfer Learning for Weakly-Supervised Relation Extraction

Jiang, Jing

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

A multitask transfer learning solution	We learn the optimal weight vectors {fikfifzv fiT and 3 by optimizing the following objective function:
A multitask transfer learning solution	The objective function follows standard empirical risk minimization with regularization.
A multitask transfer learning solution	Recall that we impose a constraint FV = 0 when optimizing the objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

26. Post-Retrieval Clustering Using Third-Order Similarity Measures

Moreno, Jose G. and Dias, Gaël and Cleuziou, Guillaume

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Finally, the evolution of the objective function of the adapted K -means is modeled to automatically define the “best” number of clusters.
Polythetic Post-Retrieval Clustering	To assure convergence, an objective function Q is defined which decreases at each processing step.
Polythetic Post-Retrieval Clustering	The classical objective function is defined in Equation (1) where wk, is a cluster labeled k, xi 6 wk, is an object in the cluster, mm is the centroid of the cluster wk, and E(., is the Euclidean distance.
Polythetic Post-Retrieval Clustering	A direct consequence of the change in similarity measure is the definition of a new objective function Q53 to ensure convergence.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

27. Bilingually-Guided Monolingual Dependency Grammar Induction

Liu, Kai and Lü, Yajuan and Jiang, Wenbin and Liu, Qun

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Bilingually-Guided Dependency Grammar Induction	In that case, we can use a single parameter 04 to control both weights for different objective functions .
Bilingually-Guided Dependency Grammar Induction	When 04 = 1 it is the unsupervised objective function in Formula (6).
Bilingually-Guided Dependency Grammar Induction	Contrary, if 04 = 0, it is the projection objective function (Formula (7)) for projected instances.
Unsupervised Dependency Grammar Induction	We select a simple classifier objective function as the unsupervised objective function which is instinctively in accordance with the parsing objective:

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

28. Minimized Models for Unsupervised Part-of-Speech Tagging

Ravi, Sujith and Knight, Kevin

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	In MDL, there is a single objective function to (1) maximize the likelihood of observing the data, and at the same time (2) minimize the length of the model description (which depends on the model size).
Discussion	However, the search procedure for MDL is usually nontrivial, and for our task of unsupervised tagging, we have not found a direct objective function which we can optimize and produce good tagging results.
Small Models	Finally, we add an objective function that minimizes the number of grammar variables that are assigned a value of 1.
Small Models	objective function value of 459.3

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

29. Unsupervised Dependency Parsing with Transferring Distribution via Parallel Guidance and Entropy Regularization

Ma, Xuezhe and Xia, Fei

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	For projective parsing, several algorithms (McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010; Ma and Zhao, 2012) have been proposed to solve the model training problems (calculation of objective function and gradient) for different factorizations.
Our Approach	We introduce a multiplier 7 as a tradeoff between the two contributions (parallel and unsupervised) of the objective function K, and the final objective function K I has the following form:
Our Approach	To train our parsing model, we need to find out the parameters A that minimize the objective function K I in equation (11).
Our Approach	objective function and the gradient of the objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

30. A Provably Correct Learning Algorithm for Latent-Variable PCFGs

Cohen, Shay B. and Collins, Michael

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Additional Details of the Algorithm	Next, we modify the objective function in Eq. '
Additional Details of the Algorithm	Thus the new objective function consists of a sun of L x M 2 terms, each corresponding to a differen combination of inside and outside features.
Introduction	2) Optimization of a convex objective function using EM.
The Learning Algorithm for L-PCFGS	Step 2: Use the EM algorithm to find 75 values that maximize the objective function in Eq.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

31. Is this a wampimuk? Cross-modal mapping between distributional semantics and the visual world

Lazaridou, Angeliki and Bruni, Elia and Baroni, Marco

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup	The weights are estimated by minimizing the objective function
Results	(2013), however, our objective function yielded consistently better results in all experimental settings.
Results	8For this post-hoc analysis, we include a sparsity parameter in the objective function of Equation 5 in order to get more interpretable results; hidden units are therefore maximally activated by a only few concepts.
Results	The adaptation of NN is straightforward; the new objective function is derived as

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

32. Interpretable Semantic Vectors from a Joint Model of Brain- and Text- Based Meaning

Fyshe, Alona and Talukdar, Partha P. and Murphy, Brian and Mitchell, Tom M.

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Results	For a given value of 6 we solve the NNSE(Text) and J NNSE(Brain+Text) objective function as detailed in Equation 1 and 4 respectively.
Joint NonNegative Sparse Embedding	new objective function is:
Joint NonNegative Sparse Embedding	With A or D fixed, the objective function for NNSE(Text) and JNNSE(Brain+Text) is convex.
NonNegative Sparse Embedding	NNSE solves the following objective function:

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

33. Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection

Sun, Xu and Wang, Houfeng and Li, Wenjie

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Related Work	The SGD uses a small randomly-selected subset of the training samples to approximate the gradient of an objective function .
System Architecture	.n, parameter estimation is performed by maximizing the objective function,
System Architecture	The final objective function is as follows:
System Architecture	t E('wt) = 'w* + H (I — vofimH(w))('wo — 10), m=1 where w* is the optimal weight vector, and H is the Hessian matrix of the objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

34. Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons

Ravi, Sujith and Baldridge, Jason and Knight, Kevin

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	Thus, the better starting point provided by EMGI has more impact than the integer program that includes G1 in its objective function .
Minimized models for supertagging	There are two complementary ways to use grammar-informed initialization with the IP-minimization approach: (1) using EMGI output as the starting grammar/lexicon and (2) using the tag transitions directly in the IP objective function .
Minimized models for supertagging	For the second, we modify the objective function used in the two IP-minimization steps to be:
Minimized models for supertagging	In this way, we combine the minimization and GI strategies into a single objective function that finds a minimal grammar set while keeping the more likely tag bigrams in the chosen solution.

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

35. Cross-Lingual Latent Topic Extraction

Zhang, Duo and Mei, Qiaozhu and Zhai, ChengXiang

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Probabilistic Cross-Lingual Latent Semantic Analysis	Putting L(C) and R(C) together, we would like to maximize the following objective function which is a regularized log-likelihood:
Probabilistic Cross-Lingual Latent Semantic Analysis	Specifically, we will search for a set of values for all our parameters that can maximize the objective function defined above.
Probabilistic Cross-Lingual Latent Semantic Analysis	However, there is no closed form solution in the M-step for the whole objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

36. Learning Word Vectors for Sentiment Analysis

Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	The full objective function of the model thus learns semantic vectors that are imbued with nuanced sentiment information.
Our Model	We can efficiently learn parameters for the joint objective function using alternating maximization.
Our Model	This produces a final objective function of,
Related work	We adopt this insight, but we are able to incorporate it directly into our model’s objective function .

objective function is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

37. XMEANT: Better semantic MT evaluation without reference translations

Lo, Chi-kiu and Beloucif, Meriem and Saers, Markus and Wu, Dekai

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	However, to go beyond tuning weights in the loglinear SMT model, a cross-lingual objective function that can deeply integrate semantic frame criteria into the MT training pipeline is needed.
Conclusion	While monolingual MEANT alone accurately reflects adequacy via semantic frames and optimizing SMT against MEANT improves translation, the new cross-lingual XMEANT semantic objective function moves closer toward deep integration of semantics into the MT training pipeline.
Introduction	In order to continue driving MT towards better translation adequacy by deeply integrating semantic frame criteria into the MT training pipeline, it is necessary to have a cross-lingual semantic objective function that assesses the semantic frame similarities of input and output sentences.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

38. Fast and Robust Neural Network Joint Models for Statistical Machine Translation

Devlin, Jacob and Zbib, Rabih and Huang, Zhongqiang and Lamar, Thomas and Schwartz, Richard and Makhoul, John

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Model Variations	For MT feature weight optimization, we use iterative k-best optimization with an Expected-BLEU objective function (Rosti et al., 2010).
Neural Network Joint Model (NNJ M)	While we cannot train a neural network with this guarantee, we can explicitly encourage the log-softmaX normalizer to be as close to 0 as possible by augmenting our training objective function:
Neural Network Joint Model (NNJ M)	Note that 04 = 0 is equivalent to the standard neural network objective function .

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

39. A Discriminative Graph-Based Parser for the Abstract Meaning Representation

Flanigan, Jeffrey and Thomson, Sam and Carbonell, Jaime and Dyer, Chris and Smith, Noah A.

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Relation Identification	The score of graph G (encoded as 2) can be written as the objective function quz, where gbe = ¢Tg(e).
Relation Identification	To handle the constraint Az g b, we introduce multipliers p 2 0 to get the Lagrangian relaxation of the objective function:
Relation Identification	L(z) is an upper bound on the unrelaxed objective function quz, and is equal to it if and only if the constraints AZ g b are satisfied.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

40. Predicting Instructor's Intervention in MOOC forums

Chaturvedi, Snigdha and Goldwasser, Dan and Daumé III, Hal

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Intervention Prediction Models	Similar to the traditional maximum margin based Support Vector Machine (SVM) formulation, our model’s objective function is defined as:
Intervention Prediction Models	Replacing the term fw (253,193) with the contents of Equation 1 in the minimization objective above, reveals the key difference from the traditional SVM formulation - the objective function has a maximum term inside the global minimization problem making it non-convex.
Intervention Prediction Models	The algorithm then performs two step iteratively - first it determines the structural assignments for the negative examples, and then optimizes the fixed objective function using a cutting plane algorithm.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

41. A Constrained Viterbi Relaxation for Bidirectional Word Alignment

Chang, Yin-Wen and Rush, Alexander M. and DeNero, John and Collins, Michael

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Background	Given a sentence e of length \|e\| = I and a sentence f of length \|f\| = J, our goal is to find the best bidirectional alignment between the two sentences under a given objective function .
Background	The HMM objective function f : X —> R can be written as a linear function of :c
Background	Similarly define the objective function

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

42. ConnotationWordNet: Learning Connotation over the Word+Sense Network

Kang, Jun Seok and Feng, Song and Akoglu, Leman and Choi, Yejin

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Pairwise Markov Random Fields and Loopy Belief Propagation	We next define our objective function .
Pairwise Markov Random Fields and Loopy Belief Propagation	and x to observed ones X (variables with known labels, if any), our objective function is associated with the following joint probability distribution
Pairwise Markov Random Fields and Loopy Belief Propagation	Finding the best assignments to unobserved variables in our objective function is the inference problem.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

synsets (24)
word-level (15)
WordNet (11)

43. Bilingually-constrained Phrase Embeddings for Machine Translation

Zhang, Jiajun and Liu, Shujie and Li, Mu and Zhou, Ming and Zong, Chengqing

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Bilingually-constrained Recursive Auto-encoders	After that, we introduce the BRAE on the network structure, objective function and parameter inference.
Bilingually-constrained Recursive Auto-encoders	In the semi-supervised RAE for phrase embedding, the objective function over a (phrase, label) pair (av, 25) includes the reconstruction error and the prediction error, as illustrated in Fig.
Bilingually-constrained Recursive Auto-encoders	3.3.1 The Objective Function

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

44. Political Ideology Detection Using Recursive Neural Networks

Iyyer, Mohit and Enns, Peter and Boyd-Graber, Jordan and Resnik, Philip

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Datasets	Due to this discrepancy, the objective function in Eq.
Experiments	For this model, we also introduce a hyperparameter 6 that weights the error at annotated nodes (1 — 6) higher than the error at unannotated nodes (6); since we have more confidence in the annotated labels, we want them to contribute more towards the objective function .
Recursive Neural Networks	This induces a supervised objective function over all sentences: a regularized sum over all node losses normalized by the number of nodes N in the training set,

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

45. Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm

Vaswani, Ashish and Huang, Liang and Chiang, David

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Method	With the addition of the (0 prior, the MAP (maximum a posteriori) objective function is
Method	Let F (6) be the objective function in
Method	(Note that we don’t allow m = 0 because this can cause 6" + 6m to land on the boundary of the probability simplex, where the objective function is undefined.)

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

46. Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing

Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	By both changing the objective function to include a bias toward sparser models and improving the pruning techniques and efficiency, we achieve significant gains on test data with practical speed.
Experiments	Given an unlimited amount of time, we would tune the prior to maximize end-to-end performance, using an objective function such as BLEU.
Phrasal Inversion Transduction Grammar	First we change the objective function by incorporating a prior over the phrasal parameters.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

47. Active Learning for Multilingual Statistical Machine Translation

Haffari, Gholamreza and Sarkar, Anoop

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

AL-SMT: Multilingual Setting	This goal is formalized by the following objective function:
AL-SMT: Multilingual Setting	The nonnegative weights ad reflect the importance of the different translation tasks and 2d ad 2 l. AL-SMT formulation for single language pair is a special case of this formulation where only one of the ad’s in the objective function (1) is one and the rest are zero.
Sentence Selection: Multiple Language Pairs	The goal is to optimize the objective function (1) with minimum human effort in providing the translations.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

48. Heterogeneous Transfer Learning for Image Clustering via the SocialWeb

Yang, Qiang and Chen, Yuqiang and Xue, Gui-Rong and Dai, Wenyuan and Yu, Yong

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Image Clustering with Annotated Auxiliary Data	Based on the graphical model representation in Figure 3, we derive the log-likelihood objective function , in a similar way as in (Cohn and Hofmann, 2000), as follows
Image Clustering with Annotated Auxiliary Data	objective function ignores all the biases from the
Image Clustering with Annotated Auxiliary Data	points based on the objective function £ in Equation (5).

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

49. Phrase-Based Statistical Machine Translation as a Traveling Salesman Problem

Zaslavskiy, Mikhail and Dymetman, Marc and Cancedda, Nicola

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	SMT comparably or better than the state-of-the-art beam-search strategy, converging on solutions with higher objective function in a shorter time.
Experiments	Both algorithms do not show any clear score improvement with increasing running time which suggests that the decoder’s objective function is not very well correlated with the BLEU score on this corpus.
The Traveling Salesman Problem and its variants	LK works by generating an initial random feasible solution for the TSP problem, and then repeatedly identifying an ordered subset of k edges in the current tour and an ordered subset of k edges not included in the tour such that when they are swapped the objective function is improved.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

language model (14)
BLEU (11)
LM (11)

50. Application-driven Statistical Paraphrase Generation

Zhao, Shiqi and Lan, Xiang and Liu, Ting and Li, Sheng

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Statistical Paraphrase Generation	In SMT, however, the optimization objective function in MERT is the MT evaluation criteria, such as BLEU.
Statistical Paraphrase Generation	We therefore introduce a new optimization objective function in this paper.
Statistical Paraphrase Generation	Replacement f-measure (rf): We use rf as the optimization objective function in MERT, which is similar to the conventional f-measure and lever-agesrp and rr: 7“f = (2 X 7”]?

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

51. Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections

Das, Dipanjan and Petrov, Slav

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

PCS Induction	We trained this model by optimizing the following objective function:
PCS Projection	The first term in the objective function is the graph smoothness regularizer which encourages the distributions of similar vertices (large wij) to be similar.
PCS Projection	While it is possible to derive a closed form solution for this convex objective function , it would require the inversion of a matrix of order \|Vf\|.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

52. Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

Titov, Ivan

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Constraints on Inter-Domain Variability	We augment the multi-conditional log-likelihood L(6, 04) with the weighted regularization term G(6) to get the composite objective function:
Empirical Evaluation	The initial learning rate and the weight decay (the inverse squared variance of the Gaussian prior) were set to 0.01, and both parameters were reduced by the factor of 2 every iteration the objective function estimate went down.
Learning and Inference	The stochastic gradient descent algorithm iterates over examples and updates the weight vector based on the contribution of every considered example to the objective function L R(6, 04, 6).

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

53. Modeling Sentences in the Latent Space

Guo, Weiwei and Diab, Mona

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Limitations of Topic Models and LSA for Modeling Sentences	In effect, LSA allows missing and observed words to equally impact the objective function .
Limitations of Topic Models and LSA for Modeling Sentences	Moreover, the true semantics of the concept definitions is actually related to some missing words, but such true semantics will not be favored by the objective function , since equation 2 allows for too strong an impact by Xij = 0 for any missing word.
The Proposed Approach	The model parameters (vectors in P and Q) are optimized by minimizing the objective function:

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

54. Domain-Independent Abstract Generation for Focused Meeting Summarization

Wang, Lu and Cardie, Claire

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Surface Realization	Input : relation instances R = {(indi,argi>}£f=1, generated abstracts A = {absfijib objective function f , cost function G
Surface Realization	We employ the following objective function:
Surface Realization	Algorithm 1 sequentially finds an abstract with the greatest ratio of objective function gain to length, and add it to the summary if the gain is nonnegative.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

55. An improved MDL-based compression algorithm for unsupervised word segmentation

Chen, Ruey-Cheng

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Our analysis shows that its objective function can be efficiently approximated using the negative empirical pointwise mutual information.
Concluding Remarks	In this paper, we derive a new lower-bound approximate to the objective function used in the regularized compression algorithm.
Proposed Method	The new objective function is written out as Equation (4).

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

56. Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media

Guo, Weiwei and Li, Hao and Ji, Heng and Diab, Mona

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

WTMF on Graphs	To implement this, we add a regularization term in the objective function of WTMF (equation 2) for each linked pairs ijl , 6233-2:
WTMF on Graphs	Therefore we approximate the objective function by treating the vector length \|Q.,j as fixed values during the ALS iterations:
Weighted Textual Matrix Factorization	P and Q are optimized by minimize the objective function:

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

57. The Role of Syntax in Vector Space Models of Compositional Semantics

Hermann, Karl Moritz and Blunsom, Phil

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Learning	The gradient of the regularised objective function then becomes:
Learning	We learn the gradient using backpropagation through structure (Goller and Kuchler, 1996), and minimize the objective function using L-BFGS.
Learning	pred(l=l\|v, 6) = Singid(Wlabel ’U + blabel) (9) Given our corpus of CCG parses with label pairs (N, l), the new objective function becomes:

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

CCG (37)
recursive (10)
embeddings (9)

58. Deceptive Answer Prediction with User Preference Graph

Li, Fangtao and Gao, Yang and Zhou, Shuchang and Si, Xiance and Dai, Decheng

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Deceptive Answer Prediction with User Preference Graph	The best parameters w* can be found by minimizing the following objective function:
Deceptive Answer Prediction with User Preference Graph	The new objective function is changed as:
Deceptive Answer Prediction with User Preference Graph	In the above objective function , we impose a user graph regularization term

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

59. Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions

Lu, Xiaoming and Xie, Lei and Leung, Cheung-Chi and Ma, Bin and Li, Haizhou

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Our Proposed Approach	The objective function can be transformed
Our Proposed Approach	Given the low-dimensional semantic representation of the test data, an objective function can be defined as follows:
Our Proposed Approach	The story boundaries which minimize the objective function 8 in Eq.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

60. Exact Maximum Inference for the Fertility Hidden Markov Model

Quirk, Chris

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

HMM alignment	Let us rewrite the objective function as follows:
HMM alignment	Note how this recovers the original objective function when matching variables are found.
Introduction	This captures the positional information in the IBM models in a framework that admits exact parameter estimation inference, though the objective function is not concave: local maxima are a concern.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

61. Parsing with Compositional Vector Grammars

Socher, Richard and Bauer, John and Manning, Christopher D. and Andrew Y., Ng

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	We will first briefly introduce single word vector representations and then describe the CVG objective function , tree scoring and inference.
Introduction	The main objective function in Eq.
Introduction	The objective function is not differentiable due to the hinge loss.

objective function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper: