Linefeed Insertion into Japanese Spoken Monologue for Captioning
Ohno, Tomohiro and Murata, Masaki and Matsubara, Shigeki

Article Structure

Abstract

To support the real-time understanding of spoken monologue such as lectures and commentaries, the development of a captioning system is required.

Introduction

Real-time captioning is a technique for supporting the speech understanding of deaf persons, elderly persons, or foreigners by displaying transcribed texts of monologue speech such as lectures.

Linefeed Insertion for Spoken Monologue

In our research, we assume an environment in which captions are displayed at the site of a lecture on a screen used only for captions.

Preliminary Analysis about Linefeed Points

In our research, the points at which linefeeds should be inserted are detected by using machine learning.

Linefeed Insertion Technique

In our method, the input is a sentence on which morphological analysis, bunsetsu segmentation, clause boundary analysis and dependency analysis have been performed.
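The analyzed input can be pictured as a small data structure: one sentence as an ordered list of bunsetsus, each carrying the morphological, dependency, clause-boundary and pause information the method consumes. The field names and the three-bunsetsu toy sentence below are our own illustration, not the paper's actual data format.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Bunsetsu:
    surface: str            # surface string of the bunsetsu
    pos_tags: List[str]     # part-of-speech of each morpheme
    head: Optional[int]     # index of the bunsetsu this one depends on
    clause_boundary: bool   # clause boundary right after this bunsetsu?
    pause_after: bool       # pause right after this bunsetsu?

# Toy sentence (illustrative): "old" -> "domestic cars" -> "are covered".
# Linefeeds may only be inserted between consecutive elements of this list.
sentence = [
    Bunsetsu("古い",     ["adjective"],        head=1,    clause_boundary=False, pause_after=False),
    Bunsetsu("国産車が", ["noun", "particle"], head=2,    clause_boundary=False, pause_after=True),
    Bunsetsu("扱われる", ["verb"],             head=None, clause_boundary=True,  pause_after=True),
]
```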

Experiment

To evaluate the effectiveness of our method, we conducted an experiment on inserting linefeeds by using discourse speech data.

Discussion

In this section, we discuss the experimental results described in Section 5 to verify the effectiveness of our method in more detail.

Conclusion

This paper proposed a method for inserting linefeeds into discourse speech data.

Topics

F-measure

Appears in 7 sentences as: F-measure (7)
In Linefeed Insertion into Japanese Spoken Monologue for Captioning
  1. recall (%) precision (%) F-measure
    Page 6, “Experiment”
  2. On the other hand, the F-measure and the sentence accuracy of our method were 81.43 and 53.15%, respectively.
    Page 6, “Experiment”
  3. Here, we compared our method with baseline 3, whose F-measure was the highest among the four baselines described in Section 5.1.
    Page 7, “Discussion”
  4. recall (%) / precision (%) / F-measure:
     by human: 89.82 (459/511) / 89.82 (459/511) / 89.82
     our method: 82.19 (420/511) / 81.71 (420/514) / 81.95
    Page 8, “Discussion”
  5. In F-measure, our method achieved 91.24% (81.95/89.82) of the result by the human annotator.
    Page 8, “Discussion”
  6. recall (%) precision (%) F-measure
    Page 8, “Discussion”
  7. However, the F-measure of our method was more than 10% higher than those of the four baselines.
    Page 8, “Discussion”
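The raw counts quoted in entry 4 are enough to reproduce the reported scores. A quick check of the arithmetic (our method inserted 514 linefeeds, 420 of them correct, against 511 linefeeds in the reference):

```python
# Recall, precision and F-measure from raw linefeed counts.
def f_measure(correct: int, inserted: int, reference: int):
    recall = 100.0 * correct / reference      # fraction of reference linefeeds found
    precision = 100.0 * correct / inserted    # fraction of inserted linefeeds correct
    f = 2 * recall * precision / (recall + precision)  # harmonic mean
    return recall, precision, f

r, p, f = f_measure(correct=420, inserted=514, reference=511)
print(round(r, 2), round(p, 2), round(f, 2))   # 82.19 81.71 81.95
# Share of the human annotator's F-measure (89.82), as quoted in entry 5:
print(round(100 * f / 89.82, 2))               # 91.24
```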

dependency relation

Appears in 6 sentences as: dependency relation (5) dependency relations (1)
  1. In the analysis, we focused on the clause boundary, dependency relation, line length, pause and morpheme of the line head, and investigated the relations between them and linefeed points.
    Page 3, “Preliminary Analysis about Linefeed Points”
  2. Next, we focused on the type of the dependency relation, by which the likelihood of linefeed insertion is different.
    Page 3, “Preliminary Analysis about Linefeed Points”
  3. [Figure example: a sentence bracketed into bunsetsus, glossed "[old] [domestic cars] [are covered] [magazine] [writer] [my] [car] [story about] [ask]", with legend "→ : dependency relation"]
    Page 4, “Preliminary Analysis about Linefeed Points”
  4. Linefeeds are inserted between adjacent bunsetsus which do not depend on each other (Linefeed insertion based on dependency relations).
    Page 6, “Experiment”
  5. This is because, in the correct data, linefeeds were hardly inserted between two neighboring bunsetsus which are in a dependency relation.
    Page 6, “Experiment”
  6. However, the precision was low because, in baseline 3, linefeeds are invariably inserted between two neighboring bunsetsus which are not in a dependency relation.
    Page 6, “Experiment”
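Baseline 3 (entries 4 and 6) can be sketched in a few lines. The encoding is our assumption: `heads[i]` is the index of the bunsetsu that bunsetsu `i` depends on, `None` for the sentence-final bunsetsu.

```python
# Dependency-based baseline: break the line between adjacent bunsetsus
# whenever the left one does NOT depend on its right neighbour.
def dependency_baseline(surfaces, heads):
    lines, current = [], [surfaces[0]]
    for i in range(1, len(surfaces)):
        if heads[i - 1] == i:            # left bunsetsu depends on its neighbour,
            current.append(surfaces[i])  # so keep both on the same line
        else:
            lines.append("".join(current))
            current = [surfaces[i]]
    lines.append("".join(current))
    return lines
```

As entry 6 notes, this invariably breaks at every non-dependent pair, which is why its precision is low even though its breaks rarely split a dependency.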

machine learning

Appears in 4 sentences as: machine learning (4)
  1. Our method appropriately inserts linefeeds into a sentence by machine learning, based on information such as dependencies, clause boundaries, pauses and line length.
    Page 1, “Abstract”
  2. In our method, the linefeeds are inserted only at the boundaries between bunsetsus, and the linefeeds are appropriately inserted into a sentence by machine learning, based on information such as morphemes, dependencies, clause boundaries, pauses and line length.
    Page 1, “Introduction”
  3. In our research, the points at which linefeeds should be inserted are detected by using machine learning.
    Page 3, “Preliminary Analysis about Linefeed Points”
  4. Our method can insert linefeeds so that captions become easy to read, by using machine learning techniques on features such as morphemes, dependencies, clause boundaries, pauses and line length.
    Page 8, “Conclusion”
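The cues listed in entry 4 suggest what the learner sees at each bunsetsu boundary. The exact feature inventory below is our guess, not the paper's; it only illustrates how morpheme, dependency, clause-boundary, pause and line-length information could be turned into a feature vector for one boundary.

```python
# Hypothetical binary features for the boundary right after bunsetsu `i`.
# heads[i]: index of the bunsetsu that bunsetsu i depends on (None if final).
def boundary_features(i, heads, clause_after, pause_after, line_len, max_len=20):
    return {
        "depends_on_next": heads[i] == i + 1,          # dependency cue
        "clause_boundary": clause_after[i],            # clause-boundary cue
        "pause": pause_after[i],                       # prosodic cue
        "line_nearly_full": line_len >= 0.8 * max_len, # line-length cue
    }

feats = boundary_features(0, heads=[1, None], clause_after=[False, True],
                          pause_after=[True, False], line_len=18)
```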

maximum entropy

Appears in 4 sentences as: Maximum Entropy (1) maximum entropy (3)
  1. These probabilities are estimated by the maximum entropy method.
    Page 5, “Linefeed Insertion Technique”
  2. 4.2 Features on Maximum Entropy Method
    Page 5, “Linefeed Insertion Technique”
  3. Here, we used the maximum entropy method tool (Zhang, 2008) with the default options except “-i 2000.”
    Page 6, “Experiment”
  4. In the experiment described in Section 5, we used the linguistic information provided by humans as the features for the maximum entropy method.
    Page 8, “Discussion”
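As a stand-in for the maximum entropy toolkit the paper uses (Zhang, 2008), here is a minimal binary maxent model, which for two classes is equivalent to logistic regression, trained by gradient ascent. The toy boundary features and training data are illustrative only, not the paper's.

```python
import math

# Train a binary maxent model: each example is a set of active feature names.
def train_maxent(examples, labels, epochs=500, lr=0.5):
    w = {f: 0.0 for ex in examples for f in ex}
    b = 0.0
    for _ in range(epochs):
        for ex, y in zip(examples, labels):
            z = b + sum(w[f] for f in ex)
            p = 1.0 / (1.0 + math.exp(-z))   # P(linefeed | features)
            g = y - p                         # gradient of the log-likelihood
            b += lr * g
            for f in ex:
                w[f] += lr * g
    return w, b

def prob_linefeed(ex, w, b):
    z = b + sum(w.get(f, 0.0) for f in ex)
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: clause boundaries attract linefeeds, dependency links repel them.
X = [{"clause_boundary"}, {"clause_boundary", "pause"},
     {"depends_on_next"}, {"depends_on_next", "pause"}]
y = [1, 1, 0, 0]
w, b = train_maxent(X, y)
```

After training, a boundary marked `clause_boundary` gets a higher linefeed probability than one marked `depends_on_next`, mirroring the tendencies reported in the preliminary analysis.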

part-of-speech

Appears in 4 sentences as: Part-of-speech (1) part-of-speech (4)
  1. Here, we focused on the basic form and part-of-speech of a morpheme.
    Page 4, “Preliminary Analysis about Linefeed Points”
  2. • Part-of-speech: noun-non_independent-general [0/40], noun-nai_adjective_stem [0/40], noun-non_independent-adverbial [0/27]
    Page 4, “Preliminary Analysis about Linefeed Points”
  3. • the rightmost independent morpheme (a part-of-speech, an inflected form) and rightmost morpheme (a part-of-speech) of a bunsetsu bi
    Page 5, “Linefeed Insertion Technique”
  4. • whether or not the basic form or part-of-speech of the leftmost morpheme of the next bunsetsu of bi is one of the morphemes enumerated in Section 3.5.
    Page 5, “Linefeed Insertion Technique”

morphological analysis

Appears in 3 sentences as: morphological analysis (3)
  1. The data is annotated by hand with information on morphological analysis, bunsetsu segmentation, dependency analysis, clause boundary detection, and linefeed insertion.
    Page 3, “Preliminary Analysis about Linefeed Points”
  2. In our method, the input is a sentence on which morphological analysis, bunsetsu segmentation, clause boundary analysis and dependency analysis have been performed.
    Page 4, “Linefeed Insertion Technique”
  3. All the data are annotated with information on morphological analysis, clause boundary detection and dependency analysis by hand.
    Page 5, “Experiment”
