Building a Discourse Parser | In our work, we focused exclusively on the second step of the discourse parsing problem, i.e., constructing the RST tree from a sequence of edus that have been segmented beforehand. |
Building a Discourse Parser | At the core of our system is a set of classifiers, trained through supervised learning, which, given two consecutive spans (atomic edus or RST sub-trees) in an input document, score the likelihood of a direct structural relation between them, as well as the probabilities of that relation’s label and nuclearity. |
Building a Discourse Parser | [Figure: system pipeline. Recoverable labels: EDUs, Syntax Trees, Syntax Parsing (Charniak's nlparse), Tokenization, Lexicalization.] |
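The classifier-driven tree construction described above can be illustrated with a minimal sketch. It assumes a greedy bottom-up strategy (repeatedly merging the best-scoring pair of consecutive spans), which is one plausible way to use such scores and is not confirmed by the text. The names `build_tree`, `count_edus`, and `toy_score` are hypothetical, and the toy scorer stands in for the trained structure classifier; relation labels and nuclearity are left out.

```python
from typing import Callable, List, Tuple, Union

# A span is either an atomic edu (a string) or a sub-tree made of two spans.
Span = Union[str, Tuple["Span", "Span"]]


def build_tree(edus: List[str], score: Callable[[Span, Span], float]) -> Span:
    """Greedily merge the best-scoring pair of consecutive spans.

    `score` is a stand-in for the structure classifier (an assumption):
    it returns a higher value when two consecutive spans are more likely
    to be directly related.
    """
    if not edus:
        raise ValueError("need at least one edu")
    spans: List[Span] = list(edus)
    while len(spans) > 1:
        # Score every pair of consecutive spans; keep the best position.
        best = max(range(len(spans) - 1),
                   key=lambda i: score(spans[i], spans[i + 1]))
        # Replace the two spans with a single sub-tree.
        spans[best:best + 2] = [(spans[best], spans[best + 1])]
    return spans[0]


def count_edus(span: Span) -> int:
    """Number of edus covered by a span."""
    if isinstance(span, str):
        return 1
    return count_edus(span[0]) + count_edus(span[1])


def toy_score(left: Span, right: Span) -> float:
    """Hypothetical scorer: prefers merging small spans first."""
    return -float(count_edus(left) + count_edus(right))
```

With four edus and the toy scorer, `build_tree(["a", "b", "c", "d"], toy_score)` yields a balanced tree; a trained classifier would replace `toy_score` in practice.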
Features | Therefore, it seems useful to encode different measures of span size and positioning, using either tokens or edus as a distance unit: |
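As an illustration of such measures, here is a minimal sketch that computes size and position features for a pair of consecutive spans, counted both in tokens and in edus. The feature names, the `SpanRange` type, and the function `size_position_features` are hypothetical and are not the feature set of the original system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpanRange:
    first: int  # index of the first edu in the span
    last: int   # index of the last edu in the span (inclusive)


def size_position_features(edus: List[List[str]],
                           left: SpanRange,
                           right: SpanRange) -> Dict[str, int]:
    """Size and position of two consecutive spans, in edus and in tokens.

    `edus` is the whole document as a list of tokenized edus.
    """
    def n_edus(span: SpanRange) -> int:
        return span.last - span.first + 1

    def n_tokens(span: SpanRange) -> int:
        return sum(len(edus[i]) for i in range(span.first, span.last + 1))

    return {
        "left_size_edus": n_edus(left),
        "right_size_edus": n_edus(right),
        "left_size_tokens": n_tokens(left),
        "right_size_tokens": n_tokens(right),
        # Distance from the document start and end, in both units.
        "edus_before": left.first,
        "edus_after": len(edus) - 1 - right.last,
        "tokens_before": sum(len(e) for e in edus[:left.first]),
        "tokens_after": sum(len(e) for e in edus[right.last + 1:]),
    }
```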
Introduction | Segmentation of the input text into elementary discourse units (‘edus’). |
Introduction | Generation of the rhetorical structure tree based on ‘rhetorical relations’ (or ‘coherence relations’) as labels of the tree, with the edus constituting its terminal nodes. |
Introduction* | Discourse segmentation is the process of decomposing discourse into elementary discourse units (EDUs), which may be simple sentences or clauses in a complex sentence, and from which discourse trees are constructed. |
Principles For Discourse Segmentation | Many of our differences with Carlson and Marcu (2001), who defined EDUs for the RST Discourse Treebank (Carlson et al., 2002), arise because we adhere more closely to the original RST proposals (Mann and Thompson, 1988), which defined adjunct clauses, rather than complement (subject and object) clauses, as ‘spans’. |
Principles For Discourse Segmentation | In particular, we propose that complements of attributive and cognitive verbs (He said (that)..., I think (that)...) are not EDUs. |