Combining Speech Retrieval Results with Generalized Additive Models
Olsson, J. Scott and Oard, Douglas W.

Article Structure

Abstract

Rapid and inexpensive techniques for automatic transcription of speech have the potential to dramatically expand the types of content to which information retrieval techniques can be productively applied, but limitations in accuracy and robustness must be overcome before that promise can be fully realized.

Introduction

Speech retrieval, like other tasks that require transforming the representation of language, suffers from both random and systematic errors that are introduced by the speech-to-text transducer.

Previous Work

One approach for combining ranked retrieval results is to simply linearly combine the multiple system scores for each topic and document.
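
The linear combination mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `combine_linear`, the example weights, and the choice to treat a document missing from one system's list as contributing 0.0 are all assumptions made for the sketch.

```python
from collections import defaultdict


def combine_linear(system_scores, weights):
    """Weighted sum of per-system retrieval scores for a single topic.

    system_scores: list of dicts mapping doc_id -> score, one dict per system.
    weights: list of floats, one per system (hypothetical values).
    A document missing from a system's list contributes 0.0 (an assumption).
    """
    if len(system_scores) != len(weights):
        raise ValueError("need exactly one weight per system")
    combined = defaultdict(float)
    for scores, weight in zip(system_scores, weights):
        for doc_id, score in scores.items():
            combined[doc_id] += weight * score
    # Rank documents by combined score, best first; ties broken by doc_id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Two hypothetical systems scoring documents for the same topic.
system_a = {"d1": 0.9, "d2": 0.4}
system_b = {"d2": 0.8, "d3": 0.5}
ranking = combine_linear([system_a, system_b], weights=[0.6, 0.4])
```

In practice the scores from different systems would usually need to be normalized to a common scale before being summed; that step is omitted here.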

Generalized Additive Models

Generalized Additive Models (GAMs) are a generalization of Generalized Linear Models (GLMs),

Experiments

4.1 Dataset

Results

Table 1 shows our complete set of results.

Conclusion

While speech retrieval is one example of retrieval under errorful document representations, other similar tasks may also benefit from these combination models.

Topics

logistic regression

Appears in 6 sentences as: Logistic regression (1) logistic regression (5)
In Combining Speech Retrieval Results with Generalized Additive Models
  1. Perhaps the most closely related proposal, using logistic regression, was made first by Savoy et al.
    Page 2, “Previous Work”
  2. Logistic regression is one example from the broad class of models which GAMs encompass.
    Page 2, “Previous Work”
  3. Unlike GAMs in their full generality however, logistic regression imposes a comparatively high degree of linearity in the model structure.
    Page 2, “Previous Work”
  4. A well-known GLM in the NLP community is logistic regression (which may alternatively be derived as a maximum entropy classifier).
    Page 3, “Generalized Additive Models”
  5. In logistic regression, the response is assumed to be Binomial and the chosen link function is the logit transformation,
    Page 3, “Generalized Additive Models”
  6. Our first new approach for handling differences in transcripts is an extension of the logistic regression model previously used in data fusion work (Savoy et al., 1988).
    Page 4, “Generalized Additive Models”
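
The logit link named in sentence 5 can be made concrete with a short sketch. It assumes, for illustration only, that the predictors are per-system retrieval scores for one document; the function names, intercept, and coefficients are hypothetical and are not taken from the paper.

```python
import math


def logit(p):
    """Logit link: maps a probability in (0, 1) onto the real line."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Inverse link: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_relevance(scores, intercept, coefficients):
    """Logistic regression estimate of P(relevant) for one document.

    scores: per-system retrieval scores (assumed predictors).
    intercept, coefficients: hypothetical fitted parameters.
    """
    # The linear predictor is a plain weighted sum of the inputs.
    eta = intercept + sum(b * x for b, x in zip(coefficients, scores))
    return inverse_logit(eta)


p = predict_relevance([0.9, 0.4], intercept=-1.0, coefficients=[2.0, 1.0])
```

The linear predictor `eta` is where the linearity noted in sentence 3 enters: each score contributes only through a single coefficient. A GAM keeps the link function but replaces each `b * x` term with a smooth function of `x` estimated from data.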

See all papers in Proc. ACL 2008 that mention logistic regression.


statistically significant

Appears in 6 sentences as: statistical significance (1) statistically significant (5)
In Combining Speech Retrieval Results with Generalized Additive Models
  1. We apply the same reasoning to test for statistical significance in GMAP improvements.
    Page 5, “Experiments”
  2. combination is a statistically significant improvement (α = 0.05) over our new transcript set (that is, over the best single transcript result).
    Page 7, “Results”
  3. Tests for statistically significant improvements in GMAP are computed using our paired log AP test, as discussed in Section 4.2.2.
    Page 7, “Results”
  4. Secondly, it is the only combination approach able to produce statistically significant relative improvements on both measures for both conditions.
    Page 7, “Results”
  5. One surprising observation from Table 1 is that the mean improvement in log AP for interleaving is fairly large and yet not statistically significant (it is in fact a larger mean improvement than several other baseline combination approaches which are significant improvements).
    Page 7, “Results”
  6. While the improvement with MD-GAM is now not statistically significant (primarily because of our small query set), we found it still outperformed the oracle linear combination.
    Page 8, “Results”


language model

Appears in 5 sentences as: language model (3) language modeling (2)
In Combining Speech Retrieval Results with Generalized Additive Models
  1. Limitations in signal processing, acoustic modeling, pronunciation, vocabulary, and language modeling can be accommodated in several ways, each of which makes different tradeoffs and thus induces different
    Page 1, “Introduction”
  2. In the extreme case, the term may simply be out of vocabulary, although this may occur for various other reasons (e.g., poor language modeling or pronunciation dictionaries).
    Page 3, “Previous Work”
  3. The resulting interview transcripts have a reported mean word error rate (WER) of approximately 25% on held out data, which was obtained by priming the language model with meta-data available from preinterview questionnaires.
    Page 6, “Experiments”
  4. We use a mixture of the training transcripts and various newswire sources for our language model training.
    Page 6, “Experiments”
  5. We did not attempt to prime the language model for particular interviewees or otherwise utilize any interview metadata.
    Page 6, “Experiments”


significant improvements

Appears in 5 sentences as: significant improvement (1) significant improvements (4)
In Combining Speech Retrieval Results with Generalized Additive Models
  1. That is, we test for significant improvements in GMAP by applying the Wilcoxon signed rank test to the paired, transformed average precisions, log AP.
    Page 5, “Experiments”
  2. This represents significant improvements over IBM transcripts used in earlier CL-SR evaluations, which had a best reported WER of 39.6% (Byrne et al., 2004).
    Page 6, “Experiments”
  3. combination is a statistically significant improvement (α = 0.05) over our new transcript set (that is, over the best single transcript result).
    Page 7, “Results”
  4. Tests for statistically significant improvements in GMAP are computed using our paired log AP test, as discussed in Section 4.2.2.
    Page 7, “Results”
  5. One surprising observation from Table 1 is that the mean improvement in log AP for interleaving is fairly large and yet not statistically significant (it is in fact a larger mean improvement than several other baseline combination approaches which are significant improvements).
    Page 7, “Results”
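
The paired log AP test described in sentence 1 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' code: the `floor` value that guards against log(0), the one-sided alternative, and the normal approximation to the p-value (without tie or continuity correction, and crude for small query sets) are all choices made for the sketch.

```python
import math


def gmap(aps, floor=1e-5):
    """Geometric mean average precision: exp of the mean of log AP.

    `floor` guards against log(0); its value is an assumption.
    """
    logs = [math.log(max(ap, floor)) for ap in aps]
    return math.exp(sum(logs) / len(logs))


def wilcoxon_signed_rank(baseline_aps, combined_aps, floor=1e-5):
    """One-sided Wilcoxon signed-rank test on paired log AP values.

    Returns (w_plus, p_value). The p-value uses a normal approximation
    with no tie or continuity correction, so it is only a rough guide.
    """
    diffs = [math.log(max(c, floor)) - math.log(max(b, floor))
             for b, c in zip(baseline_aps, combined_aps)]
    diffs = [d for d in diffs if d != 0.0]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Sum of ranks belonging to queries where the combination improved.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z)
    return w_plus, p_value


# Hypothetical per-query average precisions for two runs.
baseline = [0.10, 0.20, 0.05, 0.40]
combined = [0.20, 0.30, 0.04, 0.44]
w_plus, p_value = wilcoxon_signed_rank(baseline, combined)
```

Because the test ranks differences in log AP rather than raw AP, a gain from 0.01 to 0.02 counts as much as a gain from 0.2 to 0.4, which matches what GMAP rewards. For real analyses an exact test from a statistics library would be preferable to this approximation.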
