Discovering Latent Structure in Task-Oriented Dialogues

A key challenge for computational conversation models is to discover latent structure in task-oriented dialogue, since it provides a basis for analysing, evaluating, and building conversational systems.

Modeling human conversation is a fundamental scientific pursuit.

To test the generality of our models, we study two very different datasets: a set of human-computer spoken dialogues for querying bus timetables (BusTime), and a set of human-human text-based dialogues in the technical support domain (TechSupport).

In this work, our goal is to infer the latent structure present in task-oriented conversation.

In this section, we examine the effectiveness of our models.

We have presented three new unsupervised models to discover latent structures in task-oriented dialogues.

Appears in 7 sentences as: topic distribution (7)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- where α, β, γ are symmetric Dirichlet priors on the state-wise topic distributions θ_s, the topic-wise word distributions φ_t, and the state transition multinomials, respectively. (Page 4, “Latent Structure in Dialogues”)
- Therefore, the topic distribution is often stable throughout the entire dialogue, and does not vary from turn to turn. (Page 4, “Latent Structure in Dialogues”)
- To express this, words in the TM-HMMS model are generated either from a dialogue-specific topic distribution, or from a state-specific language model.⁴ A distribution over sources is sampled once at the beginning of each dialogue and selects the expected fraction of words generated from different sources. (Page 4, “Latent Structure in Dialogues”)
- 3: For each word in utterance n, first choose a word source r according to τ, and then depending on r, generate a word w either from the session-wide topic distribution θ or the language model specified by the state s_n. (Page 4, “Latent Structure in Dialogues”)
- 1: For each session, draw a topic distribution θ. (Page 5, “Latent Structure in Dialogues”)
- 3: For each word in utterance n, first sample a word source r according to τ_{s_n}, and then depending on r, generate a word w either from the session-wide topic distribution θ or the language model specified by the state s_n. (Page 5, “Latent Structure in Dialogues”)
- However, for the TM-HMMS and TM-HMMSS models, the latent topic distribution θ creates local dependencies, rendering computation of the marginal likelihood … (Page 5, “Latent Structure in Dialogues”)
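The generative steps quoted above can be sketched end-to-end. This is a minimal illustration of a TM-HMMS-style process under stated assumptions, not the authors' implementation; all sizes, hyper-parameter values, and variable names are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and priors (not from the paper).
V, T, S = 50, 4, 3                    # vocabulary, topics, states
alpha, beta, gamma = 0.1, 0.1, 1.0    # symmetric Dirichlet priors

phi = rng.dirichlet([beta] * V, size=T)        # topic-wise word distributions
state_lm = rng.dirichlet([beta] * V, size=S)   # state-specific unigram LMs
trans = rng.dirichlet([gamma] * S, size=S)     # state transition multinomials

def generate_dialogue(n_utts=5, n_words=8):
    """Sketch of TM-HMMS generation: one topic distribution per dialogue,
    each word drawn either from it or from the current state's LM."""
    theta = rng.dirichlet([alpha] * T)    # session-wide topic distribution
    tau = rng.dirichlet([1.0, 1.0])       # distribution over word sources
    state = rng.integers(S)
    dialogue = []
    for _ in range(n_utts):
        utt = []
        for _ in range(n_words):
            source = rng.choice(2, p=tau)          # choose a word source r
            if source == 0:                        # session-wide topics
                topic = rng.choice(T, p=theta)
                utt.append(int(rng.choice(V, p=phi[topic])))
            else:                                  # state-specific LM
                utt.append(int(rng.choice(V, p=state_lm[state])))
        dialogue.append(utt)
        state = rng.choice(S, p=trans[state])      # transition to next state
    return dialogue
```

Sampling the source distribution τ once per dialogue matches the quoted description of TM-HMMS; a state-dependent τ_{s_n} would instead give the TM-HMMSS variant.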

See all papers in *Proc. ACL 2014* that mention topic distribution.


Appears in 6 sentences as: language model (4) language models (2)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- The simplest formulation we consider is an HMM where each state contains a unigram language model (LM), proposed by Chotimongkol (2008) for task-oriented dialogue and originally … (Page 3, “Latent Structure in Dialogues”)
- 3: For each word in utterance n, first choose a word source r according to τ, and then depending on r, generate a word w either from the session-wide topic distribution θ or the language model specified by the state s_n. (Page 4, “Latent Structure in Dialogues”)
- ⁴Note that a TM-HMMS model with state-specific topic models (instead of state-specific language models) would be subsumed by TM-HMM, since one topic could be used as the background topic in TM-HMMS. (Page 4, “Latent Structure in Dialogues”)
- 3: For each word in utterance n, first sample a word source r according to τ_{s_n}, and then depending on r, generate a word w either from the session-wide topic distribution θ or the language model specified by the state s_n. (Page 5, “Latent Structure in Dialogues”)
- As illustrated by the highlighted states in Figure 6, the LM-HMM model conflates interactions that commonly occur at the beginning and end of a dialogue, i.e., “acknowledge agent” and “resolve problem”, since their underlying language models are likely to produce similar probability distributions over words. (Page 6, “Experiments”)
- By incorporating topic information, our proposed models (e.g., TM-HMMSS in Figure 5) are able to bias state transitions towards more frequent flow patterns, which further helps to overcome this weakness of the language models. (Page 6, “Experiments”)
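The unigram state LMs described above are simple to sketch. The toy utterances below are invented, and illustrate why two closing-phase states can end up with similar word distributions, the conflation noted in the last two snippets.

```python
from collections import Counter

def unigram_lm(tokens, vocab, smooth=1.0):
    """Add-one-smoothed unigram LM: P(w | state) from that state's tokens."""
    counts = Counter(tokens)
    total = len(tokens) + smooth * len(vocab)
    return {w: (counts[w] + smooth) / total for w in vocab}

# Toy utterances for two dialogue-act states (invented data).
ack_agent = "thanks bye thank you bye".split()
resolve = "great thanks that works bye".split()
vocab = sorted(set(ack_agent) | set(resolve))

lm_ack = unigram_lm(ack_agent, vocab)
lm_res = unigram_lm(resolve, vocab)

# Both states put most of their mass on the same closing words
# ("thanks", "bye"), which is why an LM-HMM tends to conflate them.
```

Because emission probabilities are all an HMM sees, states with overlapping vocabularies are hard to separate without the extra topic signal the proposed models add.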


Appears in 6 sentences as: LDA (6)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- In this paper, we retain the underlying HMM, but assume words are emitted using topic models (TM), exemplified by latent Dirichlet allocation (Blei et al., 2003; LDA). (Page 2, “Introduction”)
- LDA assumes each word in an utterance is drawn from one of a set of latent topics, where each topic is a multinomial distribution over the vocabulary. (Page 2, “Introduction”)
- This paper is organized as follows: Section 2 introduces the two task-oriented domains and corpora; Section 3 details three new unsupervised generative models that combine HMMs and LDA, along with efficient inference schemes; Section 4 evaluates our models qualitatively and quantitatively; and Section 5 concludes. (Page 2, “Introduction”)
- We assume the θ's and φ's are drawn from corresponding Dirichlet priors, as in LDA. (Page 4, “Latent Structure in Dialogues”)
- All probabilities can be computed using the collapsed Gibbs sampler for LDA (Griffiths … (Page 4, “Latent Structure in Dialogues”)
- Again, we impose Dirichlet priors on the distributions over topics θ's and the distributions over words φ's, as in LDA. (Page 5, “Latent Structure in Dialogues”)
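The collapsed Gibbs sampler for LDA referenced in these snippets can be sketched as follows. This is a minimal illustration of the standard count-based update, not the paper's code; corpus sizes and hyper-parameter values are invented.

```python
import numpy as np

def lda_gibbs(docs, V, T, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    docs: list of lists of word ids in [0, V). Returns topic assignments z."""
    rng = np.random.default_rng(seed)
    ndt = np.zeros((len(docs), T))   # doc-topic counts
    ntw = np.zeros((T, V))           # topic-word counts
    nt = np.zeros(T)                 # per-topic totals
    # Random initial assignments, then fill the count tables.
    z = [[int(rng.integers(T)) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            ndt[d, t] += 1; ntw[t, w] += 1; nt[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                          # remove current assignment
                ndt[d, t] -= 1; ntw[t, w] -= 1; nt[t] -= 1
                # Collapsed conditional: (n_dt + α)(n_tw + β)/(n_t + Vβ)
                p = (ndt[d] + alpha) * (ntw[:, w] + beta) / (nt + V * beta)
                t = int(rng.choice(T, p=p / p.sum()))  # resample the topic
                z[d][i] = t
                ndt[d, t] += 1; ntw[t, w] += 1; nt[t] += 1
    return z
```

Collapsing integrates out the θ's and φ's analytically, which is why only the count tables are needed; the same idea underlies the models here, except that the extra topic-distribution coupling noted above complicates the marginal likelihood.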


Appears in 5 sentences as: LM (5)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- The simplest formulation we consider is an HMM where each state contains a unigram language model (LM), proposed by Chotimongkol (2008) for task-oriented dialogue and originally … (Page 3, “Latent Structure in Dialogues”)
- … are generated (independently) according to the LM. (Page 3, “Latent Structure in Dialogues”)
- … (2010) extends LM-HMM to allow words to be emitted from two additional sources: the topic of the current dialogue φ, or a background LM shared across all dialogues. (Page 3, “Latent Structure in Dialogues”)
- In other words, instead of generating words via an LM, we generate words from a topic model (TM), where each state maps to a mixture of topics. (Page 3, “Latent Structure in Dialogues”)
- TM-HMMS (Figure 2(b)) extends TM-HMM to allow words to be generated either from the state LM (as in LM-HMM) or from a set of dialogue topics (akin to LM-HMMS). (Page 4, “Latent Structure in Dialogues”)


Appears in 4 sentences as: topic model (1) topic models (3)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- Our methods synthesize hidden Markov models (for the underlying state) and topic models (to connect words to states). (Page 1, “Abstract”)
- In this paper, we retain the underlying HMM, but assume words are emitted using topic models (TM), exemplified by latent Dirichlet allocation (Blei et al., 2003; LDA). (Page 2, “Introduction”)
- In other words, instead of generating words via an LM, we generate words from a topic model (TM), where each state maps to a mixture of topics. (Page 3, “Latent Structure in Dialogues”)
- ⁴Note that a TM-HMMS model with state-specific topic models (instead of state-specific language models) would be subsumed by TM-HMM, since one topic could be used as the background topic in TM-HMMS. (Page 4, “Latent Structure in Dialogues”)


Appears in 3 sentences as: Gibbs sampler (1) Gibbs samplers (1) Gibbs sampling (1)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- We also assume symmetric Dirichlet priors on all multinomial distributions and apply collapsed Gibbs sampling. (Page 3, “Latent Structure in Dialogues”)
- All probabilities can be computed using the collapsed Gibbs sampler for LDA (Griffiths … (Page 4, “Latent Structure in Dialogues”)
- We run the Gibbs samplers for 1000 iterations and update all hyper-parameters using slice sampling (Neal, 2003; Wallach, 2008) every 10 iterations. (Page 6, “Experiments”)
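The hyper-parameter updates mentioned above rely on univariate slice sampling. Below is a hedged sketch of Neal's (2003) stepping-out and shrinkage procedure, demonstrated on a standard normal target rather than on the actual hyper-parameter posteriors.

```python
import math
import random

def slice_sample(x0, logp, w=1.0, max_steps=50, rng=random):
    """One univariate slice-sampling update (Neal, 2003):
    step out to bracket the slice, then shrink until a point is accepted."""
    # Draw the slice height under the (unnormalized) density at x0.
    logy = logp(x0) + math.log(1.0 - rng.random())
    # Stepping out: grow [l, r] until both ends leave the slice.
    l = x0 - w * rng.random()
    r = l + w
    steps = max_steps
    while steps > 0 and logp(l) > logy:
        l -= w; steps -= 1
    steps = max_steps
    while steps > 0 and logp(r) > logy:
        r += w; steps -= 1
    # Shrinkage: sample uniformly in [l, r], shrink toward x0 on rejection.
    while True:
        x1 = l + (r - l) * rng.random()
        if logp(x1) > logy:
            return x1
        if x1 < x0:
            l = x1
        else:
            r = x1

# Example: sample from N(0, 1) via its unnormalized log-density.
random.seed(0)
x, samples = 0.0, []
for _ in range(2000):
    x = slice_sample(x, lambda v: -0.5 * v * v)
    samples.append(x)
```

For Dirichlet concentration parameters, the same update would be applied to the log of the hyper-parameter (or with the log-density set to -inf for non-positive values) so that positivity is preserved.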


Appears in 3 sentences as: named entities (3)

In *Discovering Latent Structure in Task-Oriented Dialogues*

- We also map all named entities (e.g., “downtown” and “28X”) to their semantic types (resp. … (Page 2, “Data”)
- We map named entities to their semantic types, apply stemming, and remove stop words.³ The corpus we use contains approximately 2,000 dialogue sessions, or 80,000 conversation utterances. (Page 3, “Data”)
- ³We used regular expressions to map named entities, and the Porter stemmer in NLTK to stem all tokens. (Page 3, “Latent Structure in Dialogues”)
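The preprocessing footnote above can be sketched with the standard library alone. The patterns, stop-word list, and semantic-type labels here are invented for illustration; a real pipeline would also apply NLTK's PorterStemmer for the stemming step, which is omitted to keep the sketch dependency-free.

```python
import re

# Toy patterns for a BusTime-like domain (invented): route codes such
# as "28X" and a few place names map to bracketed semantic types.
ENTITY_PATTERNS = [
    (re.compile(r"\b\d{1,3}[A-Z]?\b"), "[route]"),
    (re.compile(r"\b(downtown|airport|oakland)\b", re.I), "[place]"),
]
STOP_WORDS = {"the", "to", "a", "is", "i", "at"}

def preprocess(utterance):
    """Map named entities to semantic types, lowercase, drop stop words."""
    for pattern, sem_type in ENTITY_PATTERNS:
        utterance = pattern.sub(sem_type, utterance)
    return [t.lower() for t in utterance.split()
            if t.lower() not in STOP_WORDS]

print(preprocess("I want the 28X to downtown"))
# → ['want', '[route]', '[place]']
```

Collapsing entities to their types shrinks the vocabulary and lets the state LMs and topics capture dialogue function (asking for a route) rather than specific values (which route).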
