Text-level Discourse Parsing with Rich Linguistic Features
Feng, Vanessa Wei and Hirst, Graeme

Article Structure

Abstract

In this paper, we develop an RST-style text-level discourse parser, based on the HILDA discourse parser (Hernault et al., 2010b).

Introduction

In a well-written text, no unit of the text is completely isolated; interpretation requires understanding the unit’s relation with the context.

Discourse-annotated corpora

2.1 The RST Discourse Treebank

Related work

Discourse parsing was first brought to prominence by Marcu (1997).

Text-level discourse parsing

Not until recently has discourse parsing for full texts been a research focus; previously, discourse parsing was performed only at the sentence level.

Method

We use the HILDA discourse parser of Hernault et al.

Experiments

As discussed in Section 5.1, our research focus in this paper is the tree-building step of the HILDA discourse parser, which consists of two classifications: Structure and Relation classification.

Conclusions

In this paper, we aimed to develop an RST-style text-level discourse parser.

Topics

discourse parsing

Appears in 32 sentences as: discourse parser (19), discourse parsers (2), Discourse parsing (1), discourse parsing (20)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. In this paper, we develop an RST-style text-level discourse parser, based on the HILDA discourse parser (Hernault et al., 2010b).
    Page 1, “Abstract”
  2. We also analyze the difficulty of extending traditional sentence-level discourse parsing to text-level parsing by comparing discourse-parsing performance under different discourse conditions.
    Page 1, “Abstract”
  3. Research in discourse parsing aims to unmask such relations in text, which is helpful for many downstream applications such as summarization, information retrieval, and question answering.
    Page 1, “Introduction”
  4. However, most existing discourse parsers operate on individual sentences alone, whereas discourse parsing is more powerful for text-level analysis.
    Page 1, “Introduction”
  5. Therefore, in this work, we aim to develop a text-level discourse parser.
    Page 1, “Introduction”
  6. We follow the framework of Rhetorical Structure Theory (Mann and Thompson, 1988) and we take the HILDA discourse parser (Hernault et al., 2010b) as the basis of our work, because it is the first fully implemented text-level discourse parser with state-of-the-art performance.
    Page 1, “Introduction”
  7. difficulty with extending traditional sentence-level discourse parsing to text-level parsing, by comparing discourse parsing performance under different discourse conditions.
    Page 1, “Introduction”
  8. Discourse parsing was first brought to prominence by Marcu (1997).
    Page 2, “Related work”
  9. Here we briefly review two fully implemented text-level discourse parsers with the state-of-the-art performance.
    Page 2, “Related work”
  10. The HILDA discourse parser of Hernault and his colleagues (duVerle and Prendinger, 2009; Hernault et al., 2010b) is the first fully-implemented feature-based discourse parser that works at the full text level.
    Page 2, “Related work”
  11. Subsequently, they fully implemented an end-to-end PDTB-style discourse parser (Lin et al., 2010).
    Page 2, “Related work”

EDUs

Appears in 11 sentences as: EDUs (12)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. In the framework of RST, a coherent text can be represented as a discourse tree whose leaves are non-overlapping text spans called elementary discourse units (EDUs); these are the minimal text units of discourse trees.
    Page 1, “Discourse-annotated corpora”
  2. The example text fragment shown in Figure 1 consists of four EDUs (e1-e4), segmented by square brackets.
    Page 1, “Discourse-annotated corpora”
  3. The two EDUs e1 and e2 are related by a mononuclear relation ATTRIBUTION, where e1 is the more salient span; the span (e1-e2) and the EDU e3 are related by a multi-nuclear relation SAME-UNIT, where they are equally salient.
    Page 1, “Discourse-annotated corpora”
  4. Figure 1: An example text fragment (wsj_0616) composed of four EDUs, and its RST discourse tree representation.
    Page 2, “Discourse-annotated corpora”
  5. The two EDUs associated with each sentence are coherent themselves, whereas the combination of the two sentences is not coherent at the sentence boundary.
    Page 3, “Text-level discourse parsing”
  6. Following the methodology of HILDA, an input text is first segmented into EDUs.
    Page 4, “Method”
  7. Then, from the EDUs, a bottom-up approach is applied to build a discourse tree for the full text.
    Page 4, “Method”
  8. Initially, a binary Structure classifier evaluates whether a discourse relation is likely to hold between consecutive EDUs.
    Page 4, “Method”
  9. The two EDUs which are most probably connected by a discourse relation are merged into a discourse subtree of two EDUs.
    Page 4, “Method”
  10. Next, the Structure classifier and the Relation classifier are employed in cascade to reevaluate which relations are the most likely to hold between adjacent spans (discourse subtrees of any size, including atomic EDUs).
    Page 4, “Method”
  11. In all instances, both SL and SR must correspond to a constituent in the discourse tree, which can be either an atomic EDU or a concatenation of multiple consecutive EDUs.
    Page 4, “Method”
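
Sentences 6-11 above describe HILDA's greedy bottom-up tree-building procedure. As a rough illustrative sketch, not the authors' implementation, the control flow can be written in Python as follows, where structure_score and relation_label are hypothetical stand-ins for the trained Structure and Relation classifiers:

# Sketch of a greedy bottom-up tree builder in the style of HILDA's
# tree-building step (illustrative only; not the authors' code).
# structure_score(left, right) is assumed to return the probability that
# a discourse relation holds between two adjacent spans, and
# relation_label(left, right) the most likely relation label.
def build_discourse_tree(edus, structure_score, relation_label):
    # Each span starts out as an atomic EDU (a leaf of the tree).
    spans = [("EDU", edu) for edu in edus]

    # Greedily merge the adjacent pair most likely to be connected by a
    # discourse relation, re-scoring adjacent spans after every merge,
    # until a single tree covers the full text.
    while len(spans) > 1:
        scores = [structure_score(spans[i], spans[i + 1])
                  for i in range(len(spans) - 1)]
        best = max(range(len(scores)), key=scores.__getitem__)
        left, right = spans[best], spans[best + 1]
        spans[best:best + 2] = [(relation_label(left, right), left, right)]
    return spans[0]

# Example call with dummy classifiers (purely hypothetical):
# build_discourse_tree(["e1", "e2", "e3"],
#                      lambda l, r: 0.5,
#                      lambda l, r: "ELABORATION")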

sentence-level

Appears in 8 sentences as: sentence-level (9)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. We also analyze the difficulty of extending traditional sentence-level discourse parsing to text-level parsing by comparing discourse-parsing performance under different discourse conditions.
    Page 1, “Abstract”
  2. difficulty with extending traditional sentence-level discourse parsing to text-level parsing, by comparing discourse parsing performance under different discourse conditions.
    Page 1, “Introduction”
  3. Unlike syntactic parsing, where we are almost never interested in parsing above sentence level, sentence-level parsing is not sufficient for discourse parsing.
    Page 3, “Text-level discourse parsing”
  4. While a sequence of local (sentence-level) grammaticality can be considered to be global grammaticality, a sequence of local discourse coherence does not necessarily form a globally coherent text.
    Page 3, “Text-level discourse parsing”
  5. Text-level discourse parsing imposes more constraints on the global coherence than sentence-level discourse parsing.
    Page 3, “Text-level discourse parsing”
  6. However, if, technically speaking, text-level discourse parsing were no more difficult than sentence-level parsing, any sentence-level discourse parser could be easily upgraded to a text-level discourse parser just by applying it to full texts.
    Page 3, “Text-level discourse parsing”
  7. In our experiments (Section 6), we show that when applied above the sentence level, the performance of discourse parsing is consistently inferior to that within individual sentences, and we will briefly discuss what the key difficulties with extending sentence-level to text-level discourse parsing are.
    Page 3, “Text-level discourse parsing”
  8. We analyzed the difficulty of extending traditional sentence-level discourse parsing to text-level parsing by showing that using exactly the same set of features, the performance of Structure and Relation classification on cross-sentence instances is consistently inferior to that on within-sentence instances.
    Page 8, “Conclusions”

F1 score

Appears in 6 sentences as: F1 score (5), F1 scores (1)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. Performance is measured by four metrics: accuracy, precision, recall, and F1 score on the test set, shown in the first section in each subtable.
    Page 6, “Experiments”
  2. However, under this discourse condition, the distribution of positive and negative instances in both training and test sets is extremely skewed, which makes it more sensible to compare the recall and F1 scores for evaluation.
    Page 6, “Experiments”
  3. In fact, our features achieve much higher recall and F1 score despite a much lower precision and a slightly lower accuracy.
    Page 6, “Experiments”
  4. In the second section of each subtable, we also list the F1 score on the training data.
    Page 6, “Experiments”
  5. For example, looking at the training F1 score under the cross-sentence condition, we can see that classification using full features and classification without contextual features both perform significantly better on the training data than HILDA does.
    Page 7, “Experiments”
  6. (2010b), but still with a marginal, but nonetheless statistically significant, improvement on recall and F1 score.
    Page 7, “Experiments”
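
The sentences above contrast accuracy with recall and F1 score under a skewed class distribution. The following toy Python example, with invented counts that are not results from the paper, illustrates why accuracy can look high while recall and F1 remain low:

# Toy illustration (invented numbers) of why accuracy is uninformative
# when positive instances are rare, while recall and F1 are more
# sensible comparison metrics.
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 1000 instances, only 50 of them positive; the classifier finds 10.
tp, fp, fn, tn = 10, 5, 40, 945
accuracy = (tp + tn) / (tp + fp + fn + tn)   # 0.955 -- looks high
p, r, f1 = precision_recall_f1(tp, fp, fn)   # recall 0.20, F1 ~0.31
print(accuracy, p, r, f1)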

Treebank

Appears in 4 sentences as: Treebank (4)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. 2.1 The RST Discourse Treebank
    Page 1, “Discourse-annotated corpora”
  2. The RST Discourse Treebank (RST-DT) (Carlson et al., 2001) is a corpus annotated in the framework of RST.
    Page 2, “Discourse-annotated corpora”
  3. 2.2 The Penn Discourse Treebank
    Page 2, “Discourse-annotated corpora”
  4. The Penn Discourse Treebank (PDTB) (Prasad et al., 2008) is another annotated discourse corpus.
    Page 2, “Discourse-annotated corpora”

constituent parse

Appears in 3 sentences as: constituent parse (3)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. (2009) attempted to recognize implicit discourse relations (discourse relations which are not signaled by explicit connectives) in PDTB by using four classes of features — contextual features, constituent parse features, dependency parse features, and lexical features — and explored their individual influence on performance.
    Page 2, “Related work”
  2. They showed that the production rules extracted from constituent parse trees are the most effective features, while contextual features are the weakest.
    Page 2, “Related work”
  3. HILDA’s features: We incorporate the original features used in the HILDA discourse parser with slight modification, which include the following four types of features occurring in SL, SR, or both: (1) N-gram prefixes and suffixes; (2) syntactic tag prefixes and suffixes; (3) lexical heads in the constituent parse tree; and (4) POS tag of the dominating nodes.
    Page 4, “Method”
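
As an illustration of what "production rules extracted from constituent parse trees" look like as features, here is a minimal sketch using NLTK; the parse string and the bag-of-rules encoding are assumptions made for this example, not the authors' code:

# Sketch of extracting syntactic production rules from a constituent
# parse tree as binary features (illustrative assumptions only).
from nltk.tree import Tree

parse = Tree.fromstring(
    "(S (NP (DT The) (NN company)) (VP (VBD said) (SBAR (S (NP (PRP it)) "
    "(VP (VBD expected) (NP (NN growth)))))))")

# Each internal node contributes one production rule, e.g. "S -> NP VP".
rules = {str(prod) for prod in parse.productions() if prod.is_nonlexical()}

# A simple bag-of-rules feature vector for a text span.
features = {f"rule:{rule}": 1 for rule in sorted(rules)}
print(sorted(features))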

dependency parse

Appears in 3 sentences as: Dependency parse (1), dependency parse (2)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. (2009) attempted to recognize implicit discourse relations (discourse relations which are not signaled by explicit connectives) in PDTB by using four classes of features — contextual features, constituent parse features, dependency parse features, and lexical features — and explored their individual influence on performance.
    Page 2, “Related work”
  2. (2009), we extract the following three types of features: (1) pairs of words, one from SL and one from SR, as originally proposed by Marcu and Echihabi (2002); (2) dependency parse features in SL, SR, or both; and (3) syntactic production rules in SL, SR, or both.
    Page 4, “Method”
  3. One major reason is that many features that are predictive for within-sentence instances are no longer applicable (e.g., Dependency parse features).
    Page 7, “Experiments”
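
To make the word-pair features adapted from Marcu and Echihabi (2002) concrete, here is a minimal sketch; the whitespace tokenization and feature-naming scheme are illustrative assumptions, not the authors' implementation:

# Sketch of word-pair features between a left span S_L and a right span
# S_R, in the spirit of Marcu and Echihabi (2002); naming scheme and
# tokenization are illustrative assumptions.
from itertools import product

def word_pair_features(span_left, span_right):
    left_tokens = span_left.lower().split()
    right_tokens = span_right.lower().split()
    # One binary feature per (word-from-S_L, word-from-S_R) pair.
    return {f"pair:{wl}|{wr}": 1
            for wl, wr in product(left_tokens, right_tokens)}

feats = word_pair_features("profits fell sharply", "but the outlook improved")
print(len(feats))          # 3 * 4 = 12 pair features
print(sorted(feats)[:3])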

feature set

Appears in 3 sentences as: feature set (2), feature sets (1)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. We refine Hernault et al.’s original feature set by incorporating our own features as well as some adapted from Lin et al.
    Page 3, “Method”
  2. (2009) also incorporated contextual features in their feature set.
    Page 4, “Method”
  3. us to compare the model-fitting capacity of different feature sets from another perspective, especially when the training data is not sufficiently well fitted by the model.
    Page 7, “Experiments”

parse trees

Appears in 3 sentences as: parse tree (1), parse trees (2)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. They showed that the production rules extracted from constituent parse trees are the most effective features, while contextual features are the weakest.
    Page 2, “Related work”
  2. Since EDU boundaries are highly correlated with the syntactic structures embedded in the sentences, EDU segmentation is a relatively trivial step: using machine-generated syntactic parse trees, HILDA achieves an F-score of 93.8% for EDU segmentation.
    Page 4, “Method”
  3. HILDA’s features: We incorporate the original features used in the HILDA discourse parser with slight modification, which include the following four types of features occurring in SL, SR, or both: (1) N-gram prefixes and suffixes; (2) syntactic tag prefixes and suffixes; (3) lexical heads in the constituent parse tree; and (4) POS tag of the dominating nodes.
    Page 4, “Method”

significantly improve

Appears in 3 sentences as: significantly improve (2), significantly improved (1)
In Text-level Discourse Parsing with Rich Linguistic Features
  1. We significantly improve its tree-building step by incorporating our own rich linguistic features.
    Page 1, “Abstract”
  2. We significantly improve the performance of HILDA’s tree-building step (introduced in Section 5.1 below) by incorporating rich linguistic features (Section 5.3).
    Page 1, “Introduction”
  3. We chose the HILDA discourse parser (Hernault et al., 2010b) as the basis of our work, and significantly improved its tree-building step by incorporating our own rich linguistic features, together with features suggested by Lin et al.
    Page 8, “Conclusions”
