Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
Rieser, Verena and Lemon, Oliver

Article Structure

Abstract

We address two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating the result with real users.

Introduction

Designing a spoken dialogue system is a time-consuming and challenging task.

Wizard-of-Oz data collection

Our domains of interest are information-seeking dialogues, for example a multimodal in-car interface to a large database of music (MP3) files.

Simulated Learning Environment

Simulation-based RL (also known as “model-free” RL) learns by interaction with a simulated environment.
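To make the idea of learning by interaction with a simulated environment concrete, here is a minimal sketch using tabular Q-learning on a toy slot-filling dialogue. It is not the method, state space, or reward function used in the paper. The simulated user, the 0.8 success probability, the reward values, and all names (`simulated_user_step`, `train`, `greedy_policy`) are assumptions made up for illustration.

```python
import random

# Hypothetical toy setup (not from the paper): the state is the number of
# filled slots, and the system can either ask for another slot or present results.
ASK, PRESENT = 0, 1
N_SLOTS = 3


def simulated_user_step(state, action, rng):
    """Toy simulated environment: return (next_state, reward, done)."""
    if action == PRESENT:
        # Presenting ends the dialogue; it only pays off once all slots are filled.
        return state, (10.0 if state == N_SLOTS else -5.0), True
    # Asking costs one turn and fills a slot with an assumed probability of 0.8.
    if state < N_SLOTS and rng.random() < 0.8:
        state += 1
    return state, -1.0, False


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning: learn action values purely from simulated interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SLOTS + 1)]
    for _ in range(episodes):
        state = 0
        for _ in range(20):  # cap on dialogue length
            if rng.random() < epsilon:
                action = rng.choice((ASK, PRESENT))  # explore
            else:
                action = max((ASK, PRESENT), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = simulated_user_step(state, action, rng)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per state according to the learned Q-table."""
    return [max((ASK, PRESENT), key=lambda a: q[s][a]) for s in range(len(q))]
```

Calling `greedy_policy(train())` returns one action per state. The learner never inspects the transition probabilities inside `simulated_user_step`; it only observes sampled outcomes, which is the sense in which learning proceeds by interaction alone.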

User Tests

4.1 Experimental design

Comparison of Results

We finally test whether the results obtained in simulation transfer to tests with real users, following (Lemon et al., 2006a).

Conclusion

We addressed two problems in the field of automatic optimization of dialogue strategies: learning effective dialogue strategies when no initial data or system exists, and evaluating the result with real users.

Topics

significantly improved

Appears in 4 sentences as: significant improvement (1) significantly improved (3)
In Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
  1. Table 1: Predicted accuracy for presentation timing and modality (with standard deviation ±), * denotes statistically significant improvement at p < .05
    Page 4, “Simulated Learning Environment”
  2. For presentation timing, none of the classifiers produces significantly improved results.
    Page 4, “Simulated Learning Environment”
  3. data show significantly improved Task Ease, better presentation timing, more agreeable verbal and multimodal presentation, and that more users would use the RL-based system in the future (Future Use).
    Page 7, “User Tests”
  4. The results show that only the RL strategy leads to significantly improved user ratings (increasing average Task Ease by 49% and Future Use by 19%), whereas the ratings for the SL policy are not significantly better than those for the WOZ data, see Table 3.
    Page 8, “Comparison of Results”

See all papers in Proc. ACL 2008 that mention significantly improved.


significantly outperform

Appears in 4 sentences as: significantly outperform (2) significantly outperforms (2)
In Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
  1. Our results show that RL significantly outperforms Supervised Learning when interacting in simulation as well as for interactions with real users.
    Page 1, “Abstract”
  2. For learning presentation modality, both classifiers significantly outperform the baseline.
    Page 4, “Simulated Learning Environment”
  3. The results show that simulation-based RL with an environment bootstrapped from WOZ data allows learning of robust strategies which significantly outperform the strategies contained in the initial data set.
    Page 6, “Simulated Learning Environment”
  4. Our results show that RL significantly outperforms SL in simulation as well as in interactions with real users.
    Page 8, “Conclusion”
