Experimental results and discussions 6.1 Baseline experiments | In the first set of experiments, we evaluate the baseline performance of the LM and BC summarizers (cf. |
Experimental results and discussions 6.1 Baseline experiments | Second, the supervised summarizer (i.e., BC) outperforms the unsupervised summarizer (i.e., LM).
Experimental results and discussions 6.1 Baseline experiments | One possible reason is that BC is trained on handcrafted document-summary sentence labels from the development set, whereas LM operates in a purely unsupervised manner.
Proposed Methods | To estimate the sentence generative probability, we explore the language modeling (LM) approach, which has been applied to a wide spectrum of IR tasks with good empirical success.
Proposed Methods | In the LM approach, each sentence S of a document D can simply be regarded as a probabilistic generative model consisting of a unigram distribution (the so-called "bag-of-words" assumption) for generating the document (Chen et al., 2009): $P(D \mid S) = \prod_{w \in D} P(w \mid S)^{c(w, D)}$, where $c(w, D)$ is the number of times word $w$ occurs in $D$.
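Proposed Methods | As an illustration, the score above can be computed as in the sketch below; the Jelinek-Mercer smoothing against a background corpus model and the names `sentence_generative_log_prob` and `lam` are assumptions for this sketch, not details taken from the cited work:

```python
import math
from collections import Counter

def sentence_generative_log_prob(doc_words, sent_words, bg_unigram, lam=0.5):
    """log P(D|S) = sum over w of c(w, D) * log P(w|S), where the sentence
    unigram P(w|S) is linearly interpolated with a background model
    P(w|Corpus) to avoid zero probabilities (illustrative assumption)."""
    sent_counts = Counter(sent_words)
    sent_len = max(len(sent_words), 1)
    log_p = 0.0
    for w, c in Counter(doc_words).items():
        p_sent = sent_counts[w] / sent_len   # MLE estimate of P(w|S)
        p_bg = bg_unigram.get(w, 1e-9)       # background P(w|Corpus)
        log_p += c * math.log(lam * p_sent + (1 - lam) * p_bg)
    return log_p
```

Proposed Methods | Sentences can then be ranked by this score, the most probable generators of the document being selected into the summary.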
Experimental Setup and Results | As a baseline system, we built a standard phrase-based system, using the surface forms of the words without any transformations, and with a 3-gram LM in the decoder.
Experimental Setup and Results | We believe that the use of multiple language models (some much less sparse than the surface LM) in the factored baseline is the main reason for the improvement.
Experimental Setup and Results | Using a 4-gram root LM, considerably less sparse than word forms but sparser than tags, we get a BLEU score of 22.80 (max: 24.07, min: 21.57, std: 0.85).
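Experimental Setup and Results | To sketch how several LMs over different factors can contribute jointly, the scores of surface-, root-, and tag-level n-gram models can be combined log-linearly; the toy add-one-smoothed model below stands in for the real decoder LMs, and all names and weights are hypothetical:

```python
import math
from collections import Counter

def train_ngram_lm(corpus, n):
    """Toy add-one-smoothed n-gram LM returning a log-probability scorer;
    an actual factored decoder would use proper back-off LMs instead."""
    grams, ctxs, vocab = Counter(), Counter(), set()
    for sent in corpus:
        toks = ["<s>"] * (n - 1) + sent + ["</s>"]
        vocab.update(toks)
        for i in range(n - 1, len(toks)):
            grams[tuple(toks[i - n + 1:i + 1])] += 1
            ctxs[tuple(toks[i - n + 1:i])] += 1
    v = len(vocab)

    def logprob(sent):
        toks = ["<s>"] * (n - 1) + sent + ["</s>"]
        return sum(math.log((grams[tuple(toks[i - n + 1:i + 1])] + 1)
                            / (ctxs[tuple(toks[i - n + 1:i])] + v))
                   for i in range(n - 1, len(toks)))
    return logprob

def factored_lm_score(factor_seqs, lms, weights):
    """Log-linear combination of per-factor LM scores; in a real system
    the weights would be tuned, e.g., by MERT."""
    return sum(w * lm(seq) for lm, seq, w in zip(lms, factor_seqs, weights))
```

Experimental Setup and Results | With, for instance, a 3-gram surface LM, a 4-gram root LM, and an 8-gram tag LM, each translation hypothesis would be scored on all three factor sequences.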
Experiments with Constituent Reordering | These experiments were done on top of the model in Section 3.2.3, with 3-gram word and root LMs and an 8-gram tag LM.
The normalization models | All tokens $T_j$ of $S$ are concatenated together and composed with the lexical language model $LM$.
The normalization models | $S' = \mathrm{BestPath}\Big( \big( \bigodot_{j=1}^{n} T_j \big) \circ LM \Big)$ (6)
The normalization models | $LM = \mathrm{FirstProjection}( L \circ LM_w )$ (13)
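The normalization models | Assuming each token transducer encodes a finite list of weighted candidates, composing the concatenated lattice with a bigram LM and taking the best path reduces to a Viterbi search over per-token candidate lists; the sketch below assumes a bigram LM and invented names (`best_path`, `bigram_logprob`), whereas an actual implementation would build and compose WFSTs with a toolkit such as OpenFst:

```python
def best_path(token_lattices, bigram_logprob, start="<s>", end="</s>"):
    """Viterbi stand-in for S' = BestPath((T_1 ... T_n) o LM).
    token_lattices holds one list of (candidate, channel_logprob) pairs
    per token of S; bigram_logprob(prev, word) is the lexical LM score.
    All names here are illustrative."""
    beam = {start: (0.0, [])}  # last emitted word -> (score, path so far)
    for candidates in token_lattices:
        nxt = {}
        for prev, (score, path) in beam.items():
            for word, chan_lp in candidates:
                s = score + chan_lp + bigram_logprob(prev, word)
                if word not in nxt or s > nxt[word][0]:
                    nxt[word] = (s, path + [word])
        beam = nxt
    # close each surviving path with the end-of-sentence transition
    _, (best_score, best) = max(
        beam.items(), key=lambda kv: kv[1][0] + bigram_logprob(kv[0], end))
    return best
```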
Experimental Evaluation | The second one, LM, is based on statistical language models for information retrieval (Ponte and Croft, 1998).
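Experimental Evaluation | A minimal sketch of such an LM retrieval baseline ranks documents by query likelihood log P(Q|D); the Jelinek-Mercer smoothing used below is a common variant rather than Ponte and Croft's original estimator, and all names are illustrative:

```python
import math
from collections import Counter

def query_likelihood(query_terms, doc_terms, coll_tf, coll_len, lam=0.8):
    """Score a document by log P(Q|D) under its unigram model, linearly
    smoothed with collection statistics (illustrative assumption)."""
    tf = Counter(doc_terms)
    dlen = max(len(doc_terms), 1)
    score = 0.0
    for t in query_terms:
        p_doc = tf[t] / dlen                           # MLE P(t|D)
        p_coll = coll_tf.get(t, 0) / max(coll_len, 1)  # collection P(t|C)
        score += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return score
```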
Experimental Evaluation | Forum: Okapi 0.827, 0.833, 0.807, 0.751; LM 0.804, 0.833, 0.807, 0.731; Our 0.967, 0.967, 0.900, 0.850
Experimental Evaluation | Blog: Okapi 0.733, 0.651, 0.667, 0.466; LM 0.767, 0.718, 0.700, 0.524; Our 0.933, 0.894, 0.867, 0.756