Branavan, S.R.K. and Zettlemoyer, Luke and Barzilay, Regina
Article Structure
Abstract
In this paper, we address the task of mapping high-level instructions to sequences of commands in an external environment.
Introduction
In this paper, we introduce a novel method for mapping high-level instructions to commands in an external environment.
Related Work
Interpreting Instructions Our approach is most closely related to the reinforcement learning algorithm for mapping text instructions to commands developed by Branavan et al.
Problem Formulation
Our goal is to map instructions expressed in a natural language document d into the corresponding sequence of commands c = (c1, …, cn).
Background
Our innovation takes place within a previously established general framework for the task of mapping instructions to commands (Branavan et al., 2009).
Algorithm
We expand the scope of learning approaches for automatic document interpretation by enabling the analysis of high-level instructions.
Applying the Model
We apply our algorithm to the task of interpreting help documents to perform software related tasks (Branavan et al., 2009; Kushman et al., 2009).
Experimental Setup
Datasets Our model is trained on the same dataset used by Branavan et al.
Results
As shown in Table 1, our model outperforms the baseline on the two datasets, according to all evaluation metrics.
Conclusions and Future Work
In this paper, we demonstrate that knowledge about the environment can be learned and used effectively for the task of mapping instructions to actions.
Topics
learning algorithm
Appears in 5 sentences as: learning algorithm (5)
In Reading between the Lines: Learning to Map High-Level Instructions to Commands
- We present an efficient approximate approach for learning this environment model as part of a policy-gradient reinforcement learning algorithm for text interpretation.
Page 1, “Abstract”
- Our method efficiently achieves both of these goals as part of a policy-gradient reinforcement learning algorithm .
Page 1, “Introduction”
- Interpreting Instructions Our approach is most closely related to the reinforcement learning algorithm for mapping text instructions to commands developed by Branavan et al.
Page 2, “Related Work”
- We address this limitation by expanding a policy learning algorithm to take advantage of a partial environment model estimated during learning.
Page 3, “Related Work”
- The learning algorithm is provided with a set of documents d ∈ D, an environment in which to execute command sequences, and a reward function. The goal is to estimate two sets of parameters: 1) the parameters θ of the policy function, and 2) the partial environment transition model q(s′|s, c), which is the observed portion of the true model p(s′|s, c).
Page 6, “Algorithm”
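The setup quoted above (documents, an executable environment, a reward function, policy parameters θ, and a partial transition model q estimated during learning) can be sketched as a small policy-gradient loop. This is an illustrative toy, not the paper's implementation: the feature map `phi`, the episode horizon, and the environment passed in are invented assumptions.

```python
import math
import random
from collections import defaultdict

# Illustrative sketch of the policy-gradient setup described above:
# a log-linear policy over commands, episodes executed in a toy
# environment, and a partial transition model q grown from observed
# transitions.  `phi`, the horizon, and the environment are invented
# for demonstration -- this is not the paper's code.

def phi(state, command):
    # Toy indicator features over the state, the command, and the pair.
    return {("s", state): 1.0, ("c", command): 1.0, ("sc", state, command): 1.0}

def policy_probs(theta, state, commands):
    # Log-linear (softmax) distribution over candidate commands.
    scores = [sum(theta[k] * v for k, v in phi(state, c).items())
              for c in commands]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def learn(episodes, commands, execute, reward, alpha=0.1, horizon=3, seed=0):
    rng = random.Random(seed)
    theta = defaultdict(float)   # policy parameters (theta in the text)
    q = defaultdict(set)         # partial model: (s, c) -> observed next states
    for _ in range(episodes):
        state, trace = "start", []
        for _ in range(horizon):
            probs = policy_probs(theta, state, commands)
            c = rng.choices(commands, weights=probs)[0]
            nxt = execute(state, c)
            q[(state, c)].add(nxt)          # grow the observed portion of p
            trace.append((state, c, probs))
            state = nxt
        r = reward(state)
        # REINFORCE-style update: reward times (features of the chosen
        # command minus expected features under the current policy).
        for s, c, probs in trace:
            for k, v in phi(s, c).items():
                theta[k] += alpha * r * v
            for ci, p in zip(commands, probs):
                for k, v in phi(s, ci).items():
                    theta[k] -= alpha * r * p * v
    return theta, q
```

Each observed transition is recorded in q as learning proceeds, matching the idea that only the observed portion of the true transition model is estimated.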
See all papers in Proc. ACL 2010 that mention learning algorithm.
log-linear
Appears in 3 sentences as: Log-Linear (1), log-linear (2)
In Reading between the Lines: Learning to Map High-Level Instructions to Commands
- A Log-Linear Parameterization The policy function used for action selection is defined as a log-linear distribution over actions: p(a|s; θ) ∝ e^{θ·φ(s,a)}
Page 4, “Background”
- Specifically, we modify the log-linear policy p(a|s; q, θ) by adding lookahead features φ(s, a, q) which complement the local features used in the previous model.
Page 5, “Algorithm”
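The modified policy described above, in which local features are complemented by lookahead features computed from the partial model q, can be sketched as follows. The particular features `phi_local` and `phi_lookahead` are invented placeholders, assuming q maps (state, action) pairs to sets of observed next states.

```python
import math

# Sketch of the modified log-linear policy p(a | s; q, theta) described
# above: local features are complemented with lookahead features that
# consult the partial transition model q.  The specific features and
# the toy q are invented assumptions, not the paper's definitions.

def phi_local(state, action):
    return {("local", state, action): 1.0}

def phi_lookahead(state, action, q):
    # Example lookahead feature: does the partial model q already
    # contain an observed transition for (state, action)?
    return {("seen-transition",): 1.0 if q.get((state, action)) else 0.0}

def policy(theta, state, actions, q):
    # Softmax over the concatenation of local and lookahead features.
    scores = []
    for a in actions:
        feats = {**phi_local(state, a), **phi_lookahead(state, a, q)}
        scores.append(sum(theta.get(k, 0.0) * v for k, v in feats.items()))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {a: e / z for a, e in zip(actions, exps)}
```

With a positive weight on the lookahead feature, actions whose outcomes q has already observed receive higher probability, which is the intended effect of complementing the local features.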