SciSurf: Index of 'Detecting Retries of Voice Search Queries'

Detecting Retries of Voice Search Queries

Levitan, Rivka and Elson, David

Published in Proc. ACL, 2014

Article Structure

Abstract

When a system fails to correctly recognize a voice search query, the user will frequently retry the query, either by repeating it exactly or rephrasing it in an attempt to adapt to the system’s failure.

Introduction

With ever more capable smartphones connecting users to cloud-based computing, voice has been a rapidly growing modality for searching for information online.

Related Work

Previous work in voice-enabled information retrieval has investigated the problem of identifying voice retries, and some has taken the additional step of taking corrective action in instances where the user is thought to be retrying an earlier utterance.

Data and Annotation

Our data consists of pairs of queries sampled from anonymized session logs.

Features

The features we consider can be divided into three main categories.

Prediction task

5.1 Experimental Results

Conclusion

We have presented a method for characterizing retries in an unrestricted voice interface to a search system.

Topics

edit distance

Appears in 4 sentences as: edit distance (5)

In Detecting Retries of Voice Search Queries

We calculate the edit distance between the two transcripts at the character and word level, as well as the two most similar phonetic rewrites.
Page 3, “Features”
Of the similarity features, the ones that contributed significantly in the final model were character edit distance (normalized) and phoneme edit distance (raw and normalized); as expected, retries are associated with more similar query pairs.
Page 4, “Prediction task”
T-tests between the two categories showed that all edit distance features—character, word, reduced, and phonetic; raw and normalized—are significantly more similar between retry query pairs. Similarly, the number of unigrams the two queries have in common is significantly higher for retries.
Page 4, “Prediction task”
Most notably, all edit distance features are significantly greater for rephrases.
Page 5, “Prediction task”
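
The character- and word-level edit distance features quoted above can be illustrated with a minimal sketch. It assumes plain Levenshtein distance with unit costs, whitespace tokenization, and normalization by the length of the longer sequence; the paper's sentences do not specify any of these, and the function names are hypothetical. The phonetic and "reduced" variants are not reproduced here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def similarity_features(first, second):
    """Raw and normalized distances at character and word level."""
    w1, w2 = first.split(), second.split()  # assumed whitespace tokenization
    return {
        "char_raw": edit_distance(first, second),
        "char_norm": normalized_edit_distance(first, second),
        "word_raw": edit_distance(w1, w2),
        "word_norm": normalized_edit_distance(w1, w2),
    }


features = similarity_features("call mom", "call tom")
```

Under these assumptions, a pair of transcripts that differ in a single character yields a small normalized character distance, which matches the intuition that retries are associated with more similar query pairs.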

error rate

Appears in 4 sentences as: error rate (4)

In Detecting Retries of Voice Search Queries

In particular, we seek to measure and minimize the word error rate (WER) of a system, with a WER of zero indicating perfect transcription.
Page 1, “Introduction”
We do not have retry annotations for this larger set, but we have transcriptions for the first member of each query pair, enabling us to calculate the word error rate (WER) of each query’s recognition hypothesis, and thus obtain ground truth for half of our retry definition.
Page 5, “Prediction task”
However, our model’s performance today correlates strongly with an orthogonal accuracy metric, word error rate, on unseen data.
Page 5, “Conclusion”
This suggests that “retry rate” is a reasonable offline quality metric, to be considered in context among other metrics and traditional evaluation based on word error rate.
Page 5, “Conclusion”
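
For reference, word error rate is conventionally computed as the word-level edit distance between a reference transcription and the recognition hypothesis, divided by the number of reference words. The sketch below follows that standard definition. Whitespace tokenization and the function name are assumptions, and nothing here is taken from the authors' own tooling.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must contain at least one word")
    prev = list(range(len(hyp) + 1))  # edit distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference word deleted
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


perfect = word_error_rate("call mom", "call mom")
one_error = word_error_rate("call mom", "call tom")
```

A WER of zero corresponds to a perfect transcription, as the Introduction states. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.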

language model

Appears in 4 sentences as: language model (2) language modeling (2)

In Detecting Retries of Voice Search Queries

Retry cases are identified with joint language modeling across multiple transcripts, with the intuition that retry pairs tend to be closely related or exact duplicates.
Page 2, “Related Work”
While we follow this work in our usage of joint language modeling, our application encompasses open domain voice searches and voice actions (such as placing calls), so we cannot use simplifying domain assumptions.
Page 2, “Related Work”
We look at the language model (LM) score and the number of alternate pronunciations of the first query, predicting that a misrecognized query will have a lower LM score and more alternate pronunciations.
Page 4, “Features”
In addition, the language model likelihood for the first query was, as expected, significantly lower for retries.
Page 4, “Prediction task”

unigrams

Appears in 3 sentences as: unigram (1) unigrams (3)

In Detecting Retries of Voice Search Queries

We also count the number of unigrams the two transcripts have in common and the length, absolute and relative, of the longest unigram overlap.
Page 3, “Features”
In addition, we look at the number of characters and unigrams and the audio duration of each query, with the intuition that the length of a query may be correlated with its likelihood of being retried (or a retry).
Page 4, “Features”
T-tests between the two categories showed that all edit distance features—character, word, reduced, and phonetic; raw and normalized—are significantly more similar between retry query pairs. Similarly, the number of unigrams the two queries have in common is significantly higher for retries.
Page 4, “Prediction task”
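
The unigram overlap features quoted above might be computed along the following lines. This is a sketch under several assumptions that the quoted sentences leave open: whitespace tokenization, shared unigrams counted with multiplicity, "longest unigram overlap" read as the longest contiguous run of shared words, and the relative length taken against the longer query. All function names are hypothetical.

```python
from collections import Counter


def common_unigram_count(first, second):
    """Number of unigrams shared by both queries, counted with multiplicity."""
    shared = Counter(first.split()) & Counter(second.split())
    return sum(shared.values())


def longest_unigram_overlap(first, second):
    """Length in words of the longest contiguous run common to both queries."""
    w1, w2 = first.split(), second.split()
    best = 0
    prev = [0] * (len(w2) + 1)  # run lengths ending at the previous word of w1
    for x in w1:
        curr = [0]
        for j, y in enumerate(w2, start=1):
            run = prev[j - 1] + 1 if x == y else 0
            curr.append(run)
            best = max(best, run)
        prev = curr
    return best


def relative_overlap(first, second):
    """Longest overlap divided by the longer query's word count (assumed)."""
    longest_query = max(len(first.split()), len(second.split()))
    if longest_query == 0:
        return 0.0
    return longest_unigram_overlap(first, second) / longest_query


q1 = "pictures of the eiffel tower"
q2 = "photos of the eiffel tower"
overlap = (common_unigram_count(q1, q2),
           longest_unigram_overlap(q1, q2),
           relative_overlap(q1, q2))
```

In this example the two queries share four unigrams in a single contiguous run, so both the absolute and relative overlap are high, in line with the reported finding that retries have more unigrams in common.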
