Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
Man Lan, Yu Xu and Zhengyu Niu

Article Structure

Abstract

To overcome the shortage of labeled data for implicit discourse relation recognition, previous works attempted to automatically generate training data by removing explicit discourse connectives from sentences and then built models on these synthetic implicit examples.

Introduction

The task of implicit discourse relation recognition is to identify the type of discourse relation (a.k.a. rhetorical relation) between two text segments.

Related Work

2.1 Implicit discourse relation classification

2.1.1 Unsupervised approaches

Multitask Learning for Discourse Relation Prediction

3.1 Motivation

Implementation Details of Multitask Learning Method

4.1 Data sets for main and auxiliary tasks

Experiments and Results

5.1 Experiments

Conclusions

In this paper, we present a multitask learning method to improve implicit discourse relation classification by leveraging synthetic implicit discourse data.

Topics

labeled data

Appears in 8 sentences as: labeled data (9)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. To overcome the shortage of labeled data for implicit discourse relation recognition, previous works attempted to automatically generate training data by removing explicit discourse connectives from sentences and then built models on these synthetic implicit examples.
    Page 1, “Abstract”
  2. Following these two principles, we create the auxiliary tasks by generating automatically labeled data as follows.
    Page 4, “Multitask Learning for Discourse Relation Prediction”
  3. Previous work (Marcu and Echihabi, 2002) and (Sporleder and Lascarides, 2008) adopted a predefined pattern-based approach to generate synthetic labeled data, where each predefined pattern has one discourse relation label.
    Page 4, “Multitask Learning for Discourse Relation Prediction”
  4. In contrast, we adopt an automatic approach to generate synthetic labeled data, where each discourse connective between two texts serves as their relation label.
    Page 4, “Multitask Learning for Discourse Relation Prediction”
  5. Based on this mapping between connective and relation, we extract the synthetic labeled data containing the connective as training data for auxiliary tasks.
    Page 4, “Multitask Learning for Discourse Relation Prediction”
  6. We collect the mappings between connectives and relations in PDTB and generate synthetic labeled data by removing the connectives.
    Page 5, “Implementation Details of Multitask Learning Method”
  7. BLLIP North American News Text (Complete) is used as the unlabeled data source to generate synthetic labeled data.
    Page 5, “Implementation Details of Multitask Learning Method”
  8. In comparison with the synthetic labeled data generated from the explicit relations in PDTB, the synthetic labeled data from BLLIP contains more noise.
    Page 5, “Implementation Details of Multitask Learning Method”
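
A minimal Python sketch of the generation step described in excerpts 3 through 6 above; the function name and the toy connective-to-relation mapping are illustrative assumptions, not the paper's actual resources:

    # Turn an explicit example into a synthetic "implicit" one: remove the
    # connective and use its mapped relation as the label.
    CONNECTIVE_TO_RELATION = {
        "but": "Comparison",
        "because": "Contingency",
        "in addition": "Expansion",
        "then": "Temporal",
    }

    def make_synthetic_example(arg1, connective, arg2):
        relation = CONNECTIVE_TO_RELATION.get(connective.lower())
        if relation is None:
            return None  # skip connectives outside the mapping
        return {"arg1": arg1, "arg2": arg2, "label": relation}

    example = make_synthetic_example(
        "The company cut costs", "but", "profits still fell")
    # -> {'arg1': 'The company cut costs', 'arg2': 'profits still fell',
    #     'label': 'Comparison'}

Because the removed connective itself supplies the label, the generated examples need no manual annotation, which is what lets this approach scale to raw corpora.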

manually annotated

Appears in 7 sentences as: manually annotated (7)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. To overcome the shortage of manually annotated training data, (Marcu and Echihabi, 2002) proposed a pattern-based approach to automatically generate training data from raw corpora.
    Page 1, “Introduction”
  2. Later, with the release of manually annotated corpora, such as Penn Discourse Treebank 2.0 (PDTB) (Prasad et al., 2008), recent studies performed implicit discourse relation recognition on natural (i.e., genuine) implicit discourse data (Pitler et al., 2009; Lin et al., 2009; Wang et al., 2010) with the use of linguistically informed features and machine learning algorithms.
    Page 1, “Introduction”
  3. (Marcu and Echihabi, 2002) and (Sporleder and Lascarides, 2008), and (2) manually annotated explicit data with the removal of explicit discourse connectives.
    Page 2, “Introduction”
  4. Among them, it is manually annotated as an Expansion relation 2,938 times.
    Page 4, “Multitask Learning for Discourse Relation Prediction”
  5. This is because the former data is manually annotated as to whether a word serves as a discourse connective, while the latter does not manually disambiguate two types of ambiguity: whether a word serves as a discourse connective at all, and, if it does, which type of discourse relation it signals.
    Page 5, “Implementation Details of Multitask Learning Method”
  6. This is contrary to our original expectation that exp data, which has been manually annotated for discourse connective disambiguation, should outperform BLLIP, which contains a lot of noise.
    Page 8, “Experiments and Results”
  7. This finding indicates that, under multitask learning, it may not be worthwhile to use a manually annotated corpus to generate auxiliary data.
    Page 8, “Experiments and Results”
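
Excerpt 4's count of how often one connective is annotated as Expansion suggests how a connective-to-relation mapping can be derived from annotated explicit examples. A small sketch with a majority-relation filter; the min_ratio threshold and the toy counts are assumptions, not the paper's actual criterion or statistics:

    from collections import Counter, defaultdict

    def majority_relation_map(explicit_examples, min_ratio=0.9):
        """explicit_examples: iterable of (connective, relation) pairs.
        Keep a connective only if its dominant relation covers at least
        min_ratio of its occurrences, i.e. it is nearly unambiguous."""
        counts = defaultdict(Counter)
        for connective, relation in explicit_examples:
            counts[connective.lower()][relation] += 1
        mapping = {}
        for connective, relation_counts in counts.items():
            relation, freq = relation_counts.most_common(1)[0]
            if freq / sum(relation_counts.values()) >= min_ratio:
                mapping[connective] = relation
        return mapping

    toy = [("in addition", "Expansion")] * 2938 + [("in addition", "Temporal")] * 10
    print(majority_relation_map(toy))  # {'in addition': 'Expansion'}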

semi-supervised

Appears in 4 sentences as: Semi-supervised (1) semi-supervised (3)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. 2.1.3 Semi-supervised approaches
    Page 3, “Related Work”
  2. (Hernault et al., 2010) presented a semi-supervised method based on the analysis of co-occurring features in labeled and unlabeled data.
    Page 3, “Related Work”
  3. Very recently, (Hernault et al., 2011) introduced a semi-supervised approach using a structure learning method for discourse relation classification, which is quite relevant to our work.
    Page 3, “Related Work”
  4. ASO has been shown to be useful in a semi-supervised learning configuration for several NLP applications, such as text chunking (Ando and Zhang, 2005b) and text classification (Ando and Zhang, 2005a).
    Page 4, “Multitask Learning for Discourse Relation Prediction”
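
Excerpt 4 refers to Alternating Structure Optimization (ASO) (Ando and Zhang, 2005). A minimal numpy sketch of its shared-structure step, assuming linear auxiliary predictors; the dimensions, h, and the surrounding training loop are placeholders:

    import numpy as np

    def shared_feature_map(aux_weight_vectors, h=50):
        """aux_weight_vectors: m weight vectors of length d, one per
        auxiliary task. Returns theta, an (h x d) projection taken from
        the top left-singular vectors of the stacked weights, capturing
        structure shared across the auxiliary tasks."""
        W = np.stack(aux_weight_vectors, axis=1)        # d x m
        U, _, _ = np.linalg.svd(W, full_matrices=False)
        return U[:, :h].T                               # h x d

    # Toy usage: 5 auxiliary predictors over a 10-dimensional feature space.
    rng = np.random.default_rng(0)
    theta = shared_feature_map([rng.normal(size=10) for _ in range(5)], h=3)
    # A main-task example x (length d) is then augmented as [x, theta @ x].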

unlabeled data

Appears in 4 sentences as: unlabeled data (4)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. Due to the lack of benchmark data for implicit discourse relation analysis, earlier work used unlabeled data to generate synthetic implicit discourse data.
    Page 2, “Related Work”
  2. Research work in this category exploited both labeled and unlabeled data for discourse relation prediction.
    Page 3, “Related Work”
  3. (Hernault et al., 2010) presented a semi-supervised method based on the analysis of co-occurring features in labeled and unlabeled data.
    Page 3, “Related Work”
  4. BLLIP North American News Text (Complete) is used as the unlabeled data source to generate synthetic labeled data.
    Page 5, “Implementation Details of Multitask Learning Method”
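
A rough illustration of how explicit examples might be mined from raw text such as BLLIP (excerpt 4); the connective list and the naive clause-splitting pattern are assumptions, and their crudeness (no parsing, no connective disambiguation) is exactly the kind of noise this data source introduces:

    import re

    CONNECTIVES = ["but", "because", "although"]
    PATTERN = re.compile(
        r"^(?P<arg1>.+?),\s+(?P<conn>" + "|".join(CONNECTIVES) + r")\s+(?P<arg2>.+)$",
        re.IGNORECASE)

    def mine_sentence(sentence):
        """Split a raw sentence around an intra-sentence connective,
        yielding (arg1, connective, arg2) or None."""
        m = PATTERN.match(sentence.strip())
        if not m:
            return None
        return m.group("arg1"), m.group("conn").lower(), m.group("arg2")

    print(mine_sentence("Sales rose sharply, but margins kept shrinking."))
    # -> ('Sales rose sharply', 'but', 'margins kept shrinking.')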

binary classification

Appears in 3 sentences as: binary classification (2) binary classifier (1)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. Specifically, we adopt multiple binary classification to build models for the main task.
    Page 6, “Implementation Details of Multitask Learning Method”
  2. That is, for each discourse relation, we build a binary classifier.
    Page 6, “Implementation Details of Multitask Learning Method”
  3. Although previous work has been done on PDTB (Pitler et al., 2009) and (Lin et al., 2009), we cannot make a direct comparison with them because of various experimental conditions, such as different classification strategies (multi-class classification, multiple binary classification), different data preparation (feature extraction and selection), different benchmark data collections (different sections for training and test, different levels of discourse relations), and different classifiers with various parameters (MaxEnt, Naïve Bayes, SVM, etc.).
    Page 6, “Experiments and Results”
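
A sketch of the multiple-binary-classification setup in excerpts 1 and 2, one binary classifier per top-level PDTB relation; scikit-learn and logistic regression are stand-ins here, not necessarily the paper's classifier or feature set:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    RELATIONS = ["Comparison", "Contingency", "Expansion", "Temporal"]

    def train_one_vs_rest(texts, labels):
        """Train one relation-vs-rest binary classifier per relation.
        Assumes every relation in RELATIONS occurs among the labels."""
        vectorizer = CountVectorizer()
        X = vectorizer.fit_transform(texts)
        classifiers = {
            rel: LogisticRegression(max_iter=1000).fit(
                X, [1 if lab == rel else 0 for lab in labels])
            for rel in RELATIONS
        }
        return vectorizer, classifiers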

machine learning

Appears in 3 sentences as: machine learning (3)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. Later, with the release of manually annotated corpora, such as Penn Discourse Treebank 2.0 (PDTB) (Prasad et al., 2008), recent studies performed implicit discourse relation recognition on natural (i.e., genuine) implicit discourse data (Pitler et al., 2009; Lin et al., 2009; Wang et al., 2010) with the use of linguistically informed features and machine learning algorithms.
    Page 1, “Introduction”
  2. In their work, they collected word pairs from the synthetic data set as features and used a machine learning method to classify implicit discourse relations.
    Page 2, “Related Work”
  3. Multitask learning is a kind of machine learning method, which learns a main task together with other related tasks at the same time.
    Page 3, “Related Work”
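
Excerpt 2 mentions collecting word pairs from the synthetic data as features, as in Marcu and Echihabi (2002): the cross product of tokens from the two arguments. A minimal sketch, with naive whitespace tokenization assumed:

    from itertools import product

    def word_pair_features(arg1, arg2):
        """Binary word-pair features: one per (arg1 token, arg2 token)."""
        pairs = product(arg1.lower().split(), arg2.lower().split())
        return {w1 + "|" + w2: 1 for w1, w2 in pairs}

    print(word_pair_features("prices fell", "demand rose"))
    # -> {'prices|demand': 1, 'prices|rose': 1, 'fell|demand': 1, 'fell|rose': 1}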

model trained

Appears in 3 sentences as: model trained (2) models trained (1)
In Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
  1. However, a previous study (Sporleder and Lascarides, 2008) showed that models trained on these synthetic data do not generalize very well to natural (i.e., genuine) implicit discourse data.
    Page 1, “Abstract”
  2. Unlike their approach, our previous work (Zhou et al., 2010) presented a method to predict the missing connective based on a language model trained on an unannotated corpus.
    Page 3, “Related Work”
  3. However, (Sporleder and Lascarides, 2008) found that the model trained on synthetic implicit data did not perform as well as expected on natural implicit data.
    Page 3, “Multitask Learning for Discourse Relation Prediction”
