Structured Learning | We will perform discriminative training using a loss function that directly measures end-to-end summarization quality. |
Structured Learning | We use bigram recall as our loss function (see Section 3.3). |
Structured Learning | Luckily, our choice of loss function, bigram recall, factors over bigrams. |
Introduction | In addition, we use a non-symmetric loss function during optimization to account for the imbalance between over-predicting and under-predicting the beam-width. |
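To make the factorization concrete, here is a minimal sketch of a bigram-recall loss. It assumes bigrams are counted as a set (no multiplicity) and that the loss is the number of reference bigrams missing from the summary; the excerpts above do not fix either detail, and the function names are hypothetical.

```python
def bigrams(tokens):
    """Return the set of adjacent token pairs in a token list."""
    return set(zip(tokens, tokens[1:]))


def bigram_recall(summary_tokens, reference_tokens):
    """Fraction of reference bigrams that also occur in the summary."""
    ref = bigrams(reference_tokens)
    if not ref:
        return 1.0  # assumption: an empty reference is trivially covered
    return len(bigrams(summary_tokens) & ref) / len(ref)


def bigram_recall_loss(summary_tokens, reference_tokens):
    """Count of reference bigrams missing from the summary.

    The loss is a sum of per-bigram indicator terms, which is what
    "factors over bigrams" means in this sketch.
    """
    hyp = bigrams(summary_tokens)
    return sum(1 for b in bigrams(reference_tokens) if b not in hyp)


reference = "the cat sat on the mat".split()
summary = "the cat sat".split()
print(bigram_recall(summary, reference))
print(bigram_recall_loss(summary, reference))
```

Because each reference bigram contributes its own term, the loss can be folded into a decoder that already scores summaries bigram by bigram.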
Open/Closed Cell Classification | where H is the unit step function: 1 if the inner product θ · x > 0, and 0 otherwise; and L_λ(·, ·) is an asymmetric loss function, defined below. |
Open/Closed Cell Classification | To deal with this imbalance, we introduce an asymmetric loss function L_λ(·, ·) to penalize false negatives more severely during training. |
Open/Closed Cell Classification | For Constituent and Complete Closure, we also vary the loss function, adjusting the relative penalty between a false negative (closing off a chart cell that contains a maximum likelihood edge) and a false positive. |
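The asymmetric penalty described in these excerpts can be sketched as follows. This is an illustration under stated assumptions, not the authors' definition: label 1 is taken to mean "the cell must stay open", a false positive costs 1, and the false-negative cost `fn_penalty` is a hypothetical tunable parameter. The names `theta` and `x` for the weight and feature vectors are also assumptions.

```python
def unit_step(score):
    """Unit step function: 1 if the score is strictly positive, else 0."""
    return 1 if score > 0 else 0


def classify(theta, x):
    """Binary decision from the inner product of weights and features."""
    return unit_step(sum(t * f for t, f in zip(theta, x)))


def asymmetric_loss(predicted, gold, fn_penalty=10.0):
    """Zero for a correct decision, 1 for a false positive, and
    fn_penalty for a false negative (gold is 1, predicted is 0)."""
    if predicted == gold:
        return 0.0
    if gold == 1 and predicted == 0:
        return fn_penalty  # closing a cell that should have stayed open
    return 1.0


def total_loss(theta, examples, fn_penalty=10.0):
    """Sum the asymmetric loss over (feature_vector, gold_label) pairs."""
    return sum(
        asymmetric_loss(classify(theta, x), y, fn_penalty) for x, y in examples
    )
```

Raising `fn_penalty` pushes a trained classifier toward leaving cells open, trading some speed for fewer lost maximum likelihood edges.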