Index of papers in Proc. ACL 2013 that mention
  • loss function
Green, Spence and Wang, Sida and Cer, Daniel and Manning, Christopher D.
Adaptive Online Algorithms
We specify the loss function for MT in section 3.1.
Adaptive Online Algorithms
The relationship to SGD can be seen by linearizing the loss function $\ell_t(w) \approx \ell_t(w_{t-1}) + (w - w_{t-1})^\top \nabla \ell_t(w_{t-1})$ and taking the derivative of (6).
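As a brief sketch of why this recovers the SGD update (assuming, for illustration, that (6) is the usual proximal objective with step size $\eta$; the exact form of (6) is not reproduced in this excerpt):

\begin{aligned}
w_t &= \arg\min_{w}\; \ell_t(w_{t-1}) + (w - w_{t-1})^\top \nabla\ell_t(w_{t-1}) + \tfrac{1}{2\eta}\lVert w - w_{t-1}\rVert_2^2, \\
0 &= \nabla\ell_t(w_{t-1}) + \tfrac{1}{\eta}(w_t - w_{t-1}) \;\Longrightarrow\; w_t = w_{t-1} - \eta\,\nabla\ell_t(w_{t-1}),
\end{aligned}

so setting the derivative of the linearized objective to zero yields exactly the SGD step.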
Adaptive Online Algorithms
MIRA/AROW requires selecting the loss function $\ell_t$ so that $w_t$ can be solved in closed form, by a quadratic program (QP), or in some other way that is better than linearizing.
Adaptive Online MT
AdaGrad (lines 9–10) is a crucial piece, but the loss function, regularization technique, and parallelization strategy described in this section are equally important in the MT setting.
Adaptive Online MT
3.1 Pairwise Logistic Loss Function
Adaptive Online MT
The pairwise approach results in simple, convex loss functions suitable for online learning.
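A minimal sketch of a pairwise logistic loss of this kind (hypothetical names and feature vectors; the paper's sampling of hypothesis pairs is not shown in this excerpt), where x_pos is the hypothesis preferred by the evaluation metric:

    import numpy as np

    def pairwise_logistic_loss(w, x_pos, x_neg):
        # Score difference between the metric-preferred hypothesis (x_pos)
        # and the dispreferred one (x_neg) under the current weights w.
        margin = w.dot(x_pos) - w.dot(x_neg)
        # Convex logistic loss: small when x_pos scores well above x_neg.
        return np.log1p(np.exp(-margin))

    def pairwise_logistic_grad(w, x_pos, x_neg):
        margin = w.dot(x_pos) - w.dot(x_neg)
        # d/dw log(1 + exp(-margin)) = -(x_pos - x_neg) / (1 + exp(margin))
        return -(x_pos - x_neg) / (1.0 + np.exp(margin))

Because the loss is convex in w, it plugs directly into online (sub)gradient methods such as AdaGrad.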
loss function is mentioned in 7 sentences in this paper.
Durrett, Greg and Hall, David and Klein, Dan
Experiments
However, we found no simple way to change the relative performance characteristics of our various systems; notably, modifying the parameters of the loss function mentioned in Section 4 or changing it entirely did not trade off these three metrics but merely increased or decreased them in lockstep.
Learning
This optimizes for the 0-1 loss; however, we are much more interested in optimizing with respect to a coreference-specific loss function.
Learning
We modify Equation 1 to use a new probability distribution $P'$ instead of $P$, where $P'(a|x_i) \propto P(a|x_i)\exp(\ell(a, C))$ and $\ell(a, C)$ is a loss function.
Learning
In order to perform inference efficiently, $\ell(a, C)$ must decompose linearly across mentions: $\ell(a, C) = \sum_{i=1}^{n} \ell(a_i, C)$. Commonly used coreference metrics such as MUC (Vilain et al., 1995) and $B^3$ (Bagga and Baldwin, 1998) do not have this property, so we instead make use of a parameterized loss function that does and fit the parameters to give good performance.
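A hedged sketch of such a loss-augmented distribution for one mention (the names scores and losses are illustrative, not the paper's code): since $P(a|x_i)$ is a softmax over antecedent scores, multiplying by $\exp(\ell(a, C))$ simply adds the per-antecedent loss to the logits, upweighting error-prone antecedent choices during training.

    import numpy as np

    def loss_augmented_distribution(scores, losses):
        # P'(a|x_i) ∝ P(a|x_i) * exp(l(a, C)); with P a softmax over
        # antecedent scores, this is a softmax over (scores + losses).
        logits = scores + losses
        logits = logits - logits.max()   # numerical stability
        p = np.exp(logits)
        return p / p.sum()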
Related Work
Our BASIC model is a mention-ranking approach resembling models used by Denis and Baldridge (2008) and Rahman and Ng (2009), though it is trained using a novel parameterized loss function.
loss function is mentioned in 5 sentences in this paper.
He, Zhengyan and Liu, Shujie and Li, Mu and Zhou, Ming and Zhang, Longkai and Wang, Houfeng
Learning Representation for Contextual Document
The loss function is defined as the negative log of the softmax function:
Learning Representation for Contextual Document
The loss function is closely related to contrastive estimation (Smith and Eisner, 2005), which defines where the positive example takes probability mass from.
Learning Representation for Contextual Document
In our experiments, the softmax loss function consistently outperforms the pairwise ranking loss function, and is therefore taken as our default setting.
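As a purely illustrative sketch (generic forms of the two losses; the paper's exact formulas are not reproduced in this excerpt), the comparison is between a softmax negative log-likelihood over all candidates and a margin-based pairwise ranking loss:

    import numpy as np

    def softmax_nll(scores, pos_index):
        # Negative log of the softmax probability of the positive example.
        logits = scores - scores.max()   # numerical stability
        log_z = np.log(np.exp(logits).sum())
        return log_z - logits[pos_index]

    def pairwise_ranking_loss(scores, pos_index, margin=1.0):
        # Hinge-style ranking loss: the positive example must outscore
        # every negative example by at least `margin`.
        pos = scores[pos_index]
        neg = np.delete(scores, pos_index)
        return np.maximum(0.0, margin - pos + neg).sum()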
loss function is mentioned in 3 sentences in this paper.
Lampos, Vasileios and Preoţiuc-Pietro, Daniel and Cohn, Trevor
Experiments
The loss function in our evaluation is the standard Mean Square Error (MSE), but to allow a better interpretation of the results, we display its root (RMSE) in tables and figures.
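For reference, RMSE is simply the square root of MSE, reported because it is on the same scale as the target variable:

    import numpy as np

    def mse(y_true, y_pred):
        return np.mean((y_true - y_pred) ** 2)

    def rmse(y_true, y_pred):
        # Same quantity, in the units of the original targets.
        return np.sqrt(mse(y_true, y_pred))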
Methods
The squared-norm term is the standard regularisation loss function, namely the sum squared error over the training instances.
Methods
Biconvex functions and possible applications have been well studied in the optimisation literature (Quesada and Grossmann, 1995). Note that other loss functions could be used here, such as logistic loss for classification, or more generally bilinear
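A minimal sketch of the alternating strategy commonly applied to biconvex objectives of this kind (a generic illustration assuming a bilinear model $y_i \approx a^\top X_i b$, not the paper's exact solver): fixing either factor leaves a convex regularised least-squares problem in the other.

    import numpy as np

    def ridge(F, y, lam):
        # Closed-form regularised least squares: (F'F + lam*I)^{-1} F'y.
        d = F.shape[1]
        return np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ y)

    def alternating_bilinear(Xs, y, iters=20, lam=1.0):
        # Fit y_i ≈ a' X_i b by alternating convex ridge solves; the
        # objective is convex in a for fixed b, and vice versa.
        p, q = Xs[0].shape
        a, b = np.ones(p), np.ones(q)
        for _ in range(iters):
            Fa = np.stack([X @ b for X in Xs])    # design matrix for a
            a = ridge(Fa, y, lam)
            Fb = np.stack([X.T @ a for X in Xs])  # design matrix for b
            b = ridge(Fb, y, lam)
        return a, b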
loss function is mentioned in 3 sentences in this paper.
Li, Fangtao and Gao, Yang and Zhou, Shuchang and Si, Xiance and Dai, Decheng
Deceptive Answer Prediction with User Preference Graph
where $L(w^\top x_i, y_i)$ is a loss function that measures discrepancy between the predicted label $w^\top x_i$ and the true label $y_i$, where $y_i \in \{+1, -1\}$.
Deceptive Answer Prediction with User Preference Graph
Commonly used loss functions include $L(p, y) = (p - y)^2$ (least squares) and $L(p, y) = \ln(1 + \exp(-py))$ (logistic regression).
Deceptive Answer Prediction with User Preference Graph
For simplicity, here we use the least squares loss function.
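A small sketch of the two losses as written above (a generic implementation, not the authors' code), with $p = w^\top x$ and $y \in \{+1, -1\}$:

    import numpy as np

    def least_squares_loss(p, y):
        # L(p, y) = (p - y)^2
        return (p - y) ** 2

    def logistic_loss(p, y):
        # L(p, y) = ln(1 + exp(-p*y)); log1p improves numerical stability.
        return np.log1p(np.exp(-p * y))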
loss function is mentioned in 3 sentences in this paper.
Plank, Barbara and Moschitti, Alessandro
Related Work
Instance weighting is a method for domain adaptation in which instance-dependent weights are assigned to the loss function that is minimized during the training process.
Related Work
Let $\ell(x, y, \theta)$ be some loss function.
Related Work
Then, as shown in Jiang and Zhai (2007), the loss function can be weighted by $\beta_i\,\ell(x, y, \theta)$, such that $\beta_i = \frac{P_t(x_i, y_i)}{P_s(x_i, y_i)}$, where $P_s$ and $P_t$ are the source and target distributions, respectively.
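A hedged sketch of this instance-weighting scheme (the density values are assumed to be given here; estimating $P_t/P_s$ is the hard part in practice):

    import numpy as np

    def weighted_training_loss(losses, p_target, p_source):
        # Scale each source instance's loss l(x_i, y_i, theta) by
        # beta_i = P_t(x_i, y_i) / P_s(x_i, y_i), following the
        # instance-weighting view of Jiang and Zhai (2007).
        beta = p_target / p_source
        return np.sum(beta * losses)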
loss function is mentioned in 3 sentences in this paper.