Index of papers in Proc. ACL 2009 that mention

**loss function**

Introduction | We describe k-best decoding for our hybrid model and design its loss function and the features appropriate for our task. |

Training method | Given a training example (x_t, y_t), MIRA tries to establish a margin between the score of the correct path s(x_t, y_t; w) and the score of the best candidate path s(x_t, ŷ; w) based on the current weight vector w that is proportional to a loss function L(y_t, ŷ). |
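
The margin condition in the sentence above can be illustrated with a minimal sketch of a single-constraint (1-best) MIRA step. This is a simplification for illustration, not the paper's k-best training procedure: the sparse feature dictionaries, the function name `mira_update`, and the step-size cap `C` are all assumptions introduced here.

```python
def mira_update(w, feats_gold, feats_pred, loss, C=1.0):
    """One 1-best MIRA step on sparse feature dicts; returns updated weights.

    Hypothetical helper: feats_gold / feats_pred map feature names to values
    for the correct path and the best candidate path; loss is L(y_t, y_hat).
    """
    # Feature difference between the correct path and the best candidate path.
    keys = set(feats_gold) | set(feats_pred)
    delta = {k: feats_gold.get(k, 0.0) - feats_pred.get(k, 0.0) for k in keys}

    # Current margin: score(correct) - score(candidate) under weights w.
    margin = sum(w.get(k, 0.0) * v for k, v in delta.items())
    norm_sq = sum(v * v for v in delta.values())
    if norm_sq == 0.0:
        return dict(w)  # identical feature vectors: nothing to update

    # Smallest step that makes the margin at least the loss, capped at C.
    tau = min(C, max(0.0, (loss - margin) / norm_sq))

    new_w = dict(w)
    for k, v in delta.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w
```

After an uncapped update, the margin between the two paths equals the loss, which is the sense in which the margin is proportional to the loss function.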

Training method | 4.3 Loss function |

Training method | We instead compute the loss function through false positives (FP) and false negatives (FN). |
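
A minimal sketch of a loss built from false positives and false negatives follows. It assumes that each path is represented as a collection of `(start, end, tag)` items and that the loss is the plain sum of the two counts; both the representation and the function name `fp_fn_loss` are assumptions for illustration, since the excerpt does not say how the counts are combined.

```python
def fp_fn_loss(gold, predicted):
    """Count items only in the prediction (FP) plus items only in the gold (FN).

    Assumed representation: each item is a hashable (start, end, tag) tuple.
    """
    gold_set, pred_set = set(gold), set(predicted)
    false_positives = len(pred_set - gold_set)  # predicted but not in gold
    false_negatives = len(gold_set - pred_set)  # in gold but not predicted
    return false_positives + false_negatives


# Usage: the prediction splits one gold word into two.
gold = [(0, 2, "NN"), (2, 3, "VV")]
predicted = [(0, 1, "NN"), (1, 2, "NN"), (2, 3, "VV")]
print(fp_fn_loss(gold, predicted))
```

In this example the over-segmented prediction contributes two false positives and one false negative.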

loss function is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

- POS tagging (26)
- word segmentation (14)
- word-level (12)

Discussion | In this paper, we have described how MERT can be employed to estimate the weights for the linear loss function to maximize BLEU on a development set. |

Experiments | Note that N-best MBR uses a sentence BLEU loss function. |

Minimum Bayes-Risk Decoding | This reranking can be done for any sentence-level loss function such as BLEU (Papineni et al., 2001), Word Error Rate, or Position-independent Error Rate. |
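
The reranking described above can be sketched as follows: each hypothesis in an N-best list is scored by its expected loss against all other hypotheses, weighted by their posterior probabilities, and the hypothesis with the lowest expected loss is returned. The function names and the toy word-mismatch loss are assumptions for illustration; in practice the loss would be a sentence-level metric such as those named in the sentence above.

```python
def mbr_rerank(candidates, loss):
    """Return the hypothesis with minimum expected loss.

    candidates: list of (hypothesis, posterior weight) pairs (assumed format).
    loss: any sentence-level loss function loss(hypothesis, reference).
    """
    total = sum(p for _, p in candidates)  # renormalise over the N-best list

    def expected_loss(hyp):
        return sum(p / total * loss(hyp, other) for other, p in candidates)

    return min((hyp for hyp, _ in candidates), key=expected_loss)


def word_mismatch_loss(hyp, ref):
    """Toy stand-in loss: position-wise word mismatches plus length difference."""
    h, r = hyp.split(), ref.split()
    return sum(a != b for a, b in zip(h, r)) + abs(len(h) - len(r))


nbest = [("a b c", 0.4), ("a b d", 0.3), ("a e d", 0.3)]
best = mbr_rerank(nbest, word_mismatch_loss)
```

Here `best` is the middle hypothesis rather than the most probable one, because it is closest on average to the rest of the list.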

loss function is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

- n-gram (21)
- BLEU (20)
- phrase-based (11)

Variational vs. Min-Risk Decoding | They use the following loss function, of which a linear approximation to BLEU (Papineni et al., 2001) is a special case, |

Variational vs. Min-Risk Decoding | With the above loss function, Tromble et al. |

Variational vs. Min-Risk Decoding | The MBR becomes the MAP decision rule of (1) if a so-called zero-one loss function is used: l(y, y') = 0 if y = y'; otherwise l(y, y') = 1. |
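
The reduction stated above can be written out in a few steps. The notation p(y | x) for the model's posterior over outputs is an assumption introduced here, since rule (1) itself is not reproduced in this index.

```latex
\begin{align*}
\hat{y}_{\mathrm{MBR}}
  &= \operatorname*{argmin}_{y} \sum_{y'} l(y, y')\, p(y' \mid x) \\
  &= \operatorname*{argmin}_{y} \sum_{y' \neq y} p(y' \mid x)
     && \text{(zero-one loss: only } y' \neq y \text{ contributes)} \\
  &= \operatorname*{argmin}_{y} \bigl(1 - p(y \mid x)\bigr) \\
  &= \operatorname*{argmax}_{y} p(y \mid x)
\end{align*}
```

The last line is the MAP decision: minimizing the probability of being wrong is the same as choosing the most probable output.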

loss function is mentioned in 3 sentences in this paper.