Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
Jeon, Je Hun and Liu, Yang

Article Structure

Abstract

Most previous approaches to automatic prosodic event detection are based on supervised learning, relying on the availability of a corpus that is annotated with the prosodic labels of interest in order to train the classification models.

Introduction

Prosody represents suprasegmental information in speech since it normally extends over more than one phoneme segment.

Corpus and tasks

In this paper, our experiments were carried out on the Boston University Radio News Corpus (BU) (Ostendorf et al., 2003) which consists of broadcast news style read speech and has ToBI-style prosodic annotations for a part of the data.

Previous work

Many previous efforts on prosodic event detection used supervised learning approaches.

Prosodic event detection method

We model the prosody detection problem as a classification task.

Co-training strategy for prosodic event detection

Co-training (Blum and Mitchell, 1998) is a semi-supervised multi-view algorithm that uses the initial training set to learn a (weak) classifier in each view.

Experiments and results

Our goal is to determine whether the co-training algorithm described above could successfully use the unlabeled data for prosodic event detection.

Conclusions

In this paper, we exploit the co-training method for automatic prosodic event detection.

Topics

semi-supervised

Appears in 10 sentences as: semi-supervised (10)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. In this paper, we exploit semi-supervised learning with the co-training algorithm for automatic detection of coarse level representation of prosodic events such as pitch accents, intonational phrase boundaries, and break indices.
    Page 1, “Abstract”
  2. Limited research has been conducted using unsupervised and semi-supervised methods.
    Page 1, “Introduction”
  3. In this paper, we exploit semi-supervised learning with the co-training algorithm.
    Page 1, “Introduction”
  4. Our experiments on the Boston Radio News corpus show that the use of unlabeled data can lead to significant improvement of prosodic event detection compared to using the original small training set, and that the semi-supervised learning result is comparable to that of supervised learning with a similar amount of training data.
    Page 2, “Introduction”
  5. Limited research has been done in prosodic detection using unsupervised or semi-supervised methods.
    Page 3, “Previous work”
  6. She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples.
    Page 3, “Previous work”
  7. In this paper, we apply the co-training algorithm to automatic prosodic event detection and propose methods to better select samples to improve semi-supervised learning performance for this task.
    Page 3, “Previous work”
  8. Co-training (Blum and Mitchell, 1998) is a semi-supervised multi-view algorithm that uses the initial training set to learn a (weak) classifier in each view.
    Page 5, “Co-training strategy for prosodic event detection”
  9. Although the test condition is different, our result is significantly better than that of other semi-supervised approaches in previous work and comparable with supervised approaches.
    Page 8, “Experiments and results”
  10. In addition, we plan to compare this to other semi-supervised learning techniques such as active learning.
    Page 8, “Conclusions”

See all papers in Proc. ACL 2009 that mention semi-supervised.

See all papers in Proc. ACL that mention semi-supervised.


unlabeled data

Appears in 10 sentences as: unlabeled data (11)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. We propose a confidence-based method to assign labels to unlabeled data and demonstrate improved results using this method compared to the widely used agreement-based method.
    Page 1, “Abstract”
  2. In our experiments on the Boston University radio news corpus, using only a small amount of the labeled data as the initial training set, our proposed labeling method combined with most confidence sample selection can effectively use unlabeled data to improve performance and finally reach performance closer to that of the supervised method using all the training data.
    Page 1, “Abstract”
  3. We propose a confidence-based method to assign labels to unlabeled data in training iterations and evaluate its performance combined with different informative sample selection methods.
    Page 2, “Introduction”
  4. Our experiments on the Boston Radio News corpus show that the use of unlabeled data can lead to significant improvement of prosodic event detection compared to using the original small training set, and that the semi-supervised learning result is comparable to that of supervised learning with a similar amount of training data.
    Page 2, “Introduction”
  5. Given a set L of labeled data and a set U of unlabeled data, the algorithm first creates a smaller pool U’ containing u unlabeled data.
    Page 5, “Co-training strategy for prosodic event detection”
  6. There are two issues: (1) the accurate self-labeling method for unlabeled data and (2) effective heuristics to select informative samples.
    Page 5, “Co-training strategy for prosodic event detection”
  7. Given a set L of labeled training data and a set U of unlabeled data
     Randomly select U’ from U, |U’| = u
     while iteration < k do
       Use L to train classifiers h1 and h2
       Apply h1 and h2 to assign labels for all examples in U’
       Select n self-labeled samples and add to L
       Remove these n samples from U
       Recreate U’ by choosing u instances randomly from U
    Page 5, “Co-training strategy for prosodic event detection”
  8. Our goal is to determine whether the co-training algorithm described above could successfully use the unlabeled data for prosodic event detection.
    Page 6, “Experiments and results”
  9. We introduced a confidence-based method to assign possible labels to unlabeled data and evaluated the performance combined with informative sample selection methods.
    Page 8, “Conclusions”
  10. This suggests that the use of unlabeled data can lead to significant improvement for prosodic event detection.
    Page 8, “Conclusions”
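As a concrete illustration, the co-training loop quoted in item 7 above can be sketched in Python. This is a minimal sketch under assumed toy one-feature classifiers standing in for the paper's NN and SVM views; the most-confident sample selection shown is a simplified stand-in for the paper's confidence-based method, and all names here are illustrative.

```python
import random

class ThresholdClassifier:
    """Toy one-feature classifier (a stand-in for the paper's NN/SVM views):
    it learns a decision threshold halfway between the class means."""
    def fit(self, X, y):
        pos = [x for x, lab in zip(X, y) if lab == 1]
        neg = [x for x, lab in zip(X, y) if lab == 0]
        self.t = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    def predict(self, X):
        return [1 if x > self.t else 0 for x in X]
    def confidence(self, X):
        # distance from the threshold as a crude confidence score
        return [abs(x - self.t) for x in X]

def co_train(L1, L2, y, U1, U2, k=5, u=20, n=4):
    """Co-training: two views (L1/U1 and L2/U2) self-label data.
    L1, L2: labeled features per view; y: labels; U1, U2: unlabeled features.
    Each iteration trains h1 and h2 on L, labels a random pool U' of size u,
    and moves the n most confidently labeled samples into L."""
    h1, h2 = ThresholdClassifier(), ThresholdClassifier()
    U = list(range(len(U1)))                  # indices of unlabeled examples
    for _ in range(k):
        h1.fit(L1, y); h2.fit(L2, y)
        pool = random.sample(U, min(u, len(U)))
        if not pool:
            break
        # score each pool example by the more confident view's prediction
        scored = []
        for i in pool:
            c1 = h1.confidence([U1[i]])[0]; c2 = h2.confidence([U2[i]])[0]
            h = h1 if c1 >= c2 else h2
            view = U1[i] if c1 >= c2 else U2[i]
            scored.append((max(c1, c2), i, h.predict([view])[0]))
        scored.sort(reverse=True)
        for _, i, lab in scored[:n]:          # most-confident selection
            L1.append(U1[i]); L2.append(U2[i]); y.append(lab)
            U.remove(i)
    return h1, h2
```

In the paper's setting, the two views would be the acoustic-prosodic and lexical/syntactic feature sets, and the labeled/unlabeled pools the utterance sets described under "Experiments and results".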


labeled data

Appears in 5 sentences as: labeled data (6)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. In our experiments on the Boston University radio news corpus, using only a small amount of the labeled data as the initial training set, our proposed labeling method combined with most confidence sample selection can effectively use unlabeled data to improve performance and finally reach performance closer to that of the supervised method using all the training data.
    Page 1, “Abstract”
  2. Given a set L of labeled data and a set U of unlabeled data, the algorithm first creates a smaller pool U’ containing u unlabeled data.
    Page 5, “Co-training strategy for prosodic event detection”
  3. Among labeled data, 102 utterances of all f1a and m1b speakers are used for testing, 20 utterances randomly chosen from f2b, f3b, m2b, m3b, and m4b are used as development set to optimize parameters such as λ and confidence level threshold, 5 utterances are used as the initial training set L, and the rest of the data is used as unlabeled set U, which has 1027 unlabeled utterances (we removed the human labels for co-training experiments).
    Page 6, “Experiments and results”
  4. We can see that the performance of co-training for these three tasks is slightly worse than supervised learning using all the labeled data, but is significantly better than the original performance using 3% of hand-labeled data.
    Page 8, “Experiments and results”
  5. In our experiment, we used some labeled data as a development set to estimate some parameters.
    Page 8, “Conclusions”


error rate

Appears in 4 sentences as: error rate (4)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. Table 2: Percentage of positive samples, and averaged error rate for positive (P) and negative (N) samples for the first 20 iterations using the agreement-based and our confidence labeling methods.
    Page 7, “Experiments and results”
  2. Table 2 shows the percentage of the positive samples added for the first 20 iterations, and the average labeling error rate of those samples for the self-labeled positive and negative classes for two methods.
    Page 7, “Experiments and results”
  3. The agreement-based random selection added more negative samples, which also have a higher error rate than the positive samples.
    Page 7, “Experiments and results”
  4. This difference is caused by the high self-labeling error rate of selected samples.
    Page 7, “Experiments and results”


POS tag

Appears in 4 sentences as: POS tag (2) POS tagging (1) POS tags (1)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. (2007) applied a co-training method to POS tagging using an agreement-based selection strategy.
    Page 3, “Previous work”
  2. • Accent detection: syllable identity, lexical stress (exist or not), word boundary information (boundary or not), and POS tag.
    Page 4, “Prosodic event detection method”
  3. • IPB and Break index detection: POS tag, the ratio of syntactic phrases the word initiates, and the ratio of syntactic phrases the word terminates.
    Page 4, “Prosodic event detection method”
  4. As described in Section 4, we use two classifiers for the prosodic event detection task based on two different information sources: one is the acoustic evidence extracted from the speech signal of an utterance; the other is the lexical and syntactic evidence such as syllables, words, POS tags and phrasal boundary information.
    Page 5, “Co-training strategy for prosodic event detection”


development set

Appears in 3 sentences as: Development Set (1) development set (2)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. Development set: 20 utterances (1,356 words, 2,275 syllables); Labeled set L: 5 utterances (347 words, 573 syllables); Unlabeled set U: 1,027 utterances (77,207 words, 129,305 syllables); speakers: f2b, f3b, m2b, m3b, m4b
    Page 6, “Co-training strategy for prosodic event detection”
  2. Among labeled data, 102 utterances of all f1a and m1b speakers are used for testing, 20 utterances randomly chosen from f2b, f3b, m2b, m3b, and m4b are used as development set to optimize parameters such as λ and confidence level threshold, 5 utterances are used as the initial training set L, and the rest of the data is used as unlabeled set U, which has 1027 unlabeled utterances (we removed the human labels for co-training experiments).
    Page 6, “Experiments and results”
  3. In our experiment, we used some labeled data as a development set to estimate some parameters.
    Page 8, “Conclusions”


F-measure

Appears in 3 sentences as: F-measure (3)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. The F-measure score using the initial training data is 0.69.
    Page 6, “Experiments and results”
  2. Most of the previous work for prosodic event detection reported their results using classification accuracy instead of F-measure.
    Page 8, “Experiments and results”
  3. Table 3: The results (F-measure) of prosodic event detection for supervised and co-training approaches.
    Page 8, “Experiments and results”
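For reference, the F-measure scores quoted above are the harmonic mean of precision and recall on the positive (event-present) class. A minimal sketch of the standard computation (the function name is illustrative, not from the paper):

```python
def f_measure(gold, pred, positive=1):
    """F-measure (F1) for a binary detection task: harmonic mean of
    precision and recall on the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if p == positive and g != positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because prosodic events are a minority class, F-measure is less forgiving than classification accuracy, which is why the two metrics in the excerpts above are not directly comparable.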


neural network

Appears in 3 sentences as: Neural Network (1) neural network (2)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. (2004) used a Gaussian mixture model for acoustic-prosodic information and neural network based syntactic-prosodic model and achieved pitch accent detection accuracy of 84% and IPB detection accuracy of 90% at the word level.
    Page 3, “Previous work”
  2. The experiments of Ananthakrishnan and Narayanan (2008) with neural network based acoustic-prosodic model and a factored n-gram syntactic model reported 87% accuracy on accent and break index detection at the syllable level.
    Page 3, “Previous work”
  3. Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using Neural Network (NN) classifier for acoustic-prosodic evidence and Support Vector Machine (SVM) classifier for syntactic-prosodic evidence performed better than other classifiers.
    Page 3, “Prosodic event detection method”


SVM

Appears in 3 sentences as: SVM (3)
In Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm
  1. She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples.
    Page 3, “Previous work”
  2. Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using Neural Network (NN) classifier for acoustic-prosodic evidence and Support Vector Machine (SVM) classifier for syntactic-prosodic evidence performed better than other classifiers.
    Page 3, “Prosodic event detection method”
  3. We therefore use NN and SVM in this study.
    Page 3, “Prosodic event detection method”
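The NN and SVM classifiers above score the same word from two views: acoustic-prosodic and syntactic-prosodic. One simple way to fuse the two views is to interpolate their posterior scores with a weight tuned on the development set; the linear interpolation below is an assumed combination scheme for illustration, not necessarily the authors' exact formula, and the function name is hypothetical.

```python
def detect_event(p_acoustic, p_syntactic, lam=0.6, threshold=0.5):
    """Fuse posterior scores from the acoustic (NN) view and the
    syntactic (SVM) view by linear interpolation. Both lam and
    threshold would be tuned on a development set; the interpolation
    itself is an assumption, not the paper's stated formula."""
    score = lam * p_acoustic + (1.0 - lam) * p_syntactic
    return 1 if score >= threshold else 0
```

With lam near 1 the decision leans on the acoustic evidence; with lam near 0 it leans on the lexical/syntactic evidence.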
