We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario.
The dialogue policy of a dialogue system decides on which actions the system should perform given a particular dialogue state (i.e., dialogue context).
Most research in RL for dialogue management has been done in the framework of slot-filling applications such as restaurant recommendations (Lemon et al., 2006; Thomson and Young, 2010; Gasic et al., 2012; Daubigney et al., 2012), flight reservations (Henderson et al., 2008), sightseeing recommendations (Misu et al., 2010), appointment scheduling (Georgila et al., 2010), etc.
RL is a machine learning technique used to learn the policy of an agent, i.e., which action the agent should perform given its current state (Sutton and Barto, 1998).
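To make the notion of a policy concrete, the following is a minimal sketch of a tabular agent that maps states to actions and improves that mapping from rewards. It uses Q-learning with epsilon-greedy exploration purely as a familiar illustration; the text above does not say which algorithm is used, and the class name `TabularQAgent`, the state and action labels, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


class TabularQAgent:
    """Illustrative tabular Q-learning agent (an assumption, not the paper's method)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # probability of taking a random action
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated return

    def greedy_action(self, state):
        # The learned policy: choose the action with the highest estimate.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def act(self, state):
        # Epsilon-greedy: explore occasionally, otherwise follow the policy.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy_action(state)

    def update(self, state, action, reward, next_state, done):
        # One-step Q-learning update toward reward + discounted best next value.
        best_next = 0.0 if done else max(
            self.q[(next_state, a)] for a in self.actions
        )
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Toy usage with hypothetical negotiation-flavoured labels.
agent = TabularQAgent(["accept", "reject"], epsilon=0.0)
for _ in range(50):
    agent.update("offer_pending", "accept", 1.0, "end", True)
print(agent.greedy_action("offer_pending"))
```

In this toy setting the state is a single string, whereas a dialogue state would typically encode much richer context; the sketch only shows how a policy can be read off learned state-action values.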
Our domain is a resource allocation negotiation scenario.