Index of papers in Proc. ACL that mention
  • shift-reduce
Hayashi, Katsuhiko and Watanabe, Taro and Asahara, Masayuki and Matsumoto, Yuji
Abstract
This algorithm handles global structures, such as clauses and coordination, better than shift-reduce or other bottom-up algorithms.
Experiments
length, compared with top-down, 2nd-MST and shift-reduce parsers (beam size: 8, pred size: 5)
Experiments
We also implemented and experimented with Huang and Sagae's (2010) arc-standard shift-reduce parser.
Experiments
We used an early-update version of the averaged perceptron algorithm (Collins and Roark, 2004) for training the shift-reduce and top-down parsers.
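For context on the training regime this snippet mentions: in early-update training (Collins and Roark, 2004), beam-search decoding stops and the perceptron updates as soon as the gold action sequence falls out of the beam. A minimal Python sketch under stated assumptions follows; ACTIONS, the phi() features, and the toy call are illustrative stand-ins, not the paper's implementation, and weight averaging is omitted for brevity.

    # Sketch of early-update perceptron training for a beam-search
    # shift-reduce parser (Collins and Roark, 2004). ACTIONS and phi()
    # are toy stand-ins; weight averaging is omitted for brevity.

    ACTIONS = ["SHIFT", "REDUCE-L", "REDUCE-R"]

    def phi(prefix, action):
        # Toy features: a bias on the action, plus the previous action.
        prev = prefix[-1] if prefix else "<s>"
        return ["act=" + action, "prev=" + prev + "+act=" + action]

    def seq_score(weights, prefix):
        return sum(weights.get(f, 0.0)
                   for i, a in enumerate(prefix)
                   for f in phi(prefix[:i], a))

    def early_update(weights, gold_actions, beam_size=8):
        # Decode with a beam; update and stop as soon as the gold prefix
        # is no longer inside the beam (the "early update").
        beam = [[]]
        for step in range(len(gold_actions)):
            expanded = [p + [a] for p in beam for a in ACTIONS]
            expanded.sort(key=lambda p: seq_score(weights, p), reverse=True)
            beam = expanded[:beam_size]
            gold_prefix = gold_actions[:step + 1]
            if gold_prefix not in beam:
                predicted = beam[0]
                for i, a in enumerate(gold_prefix):   # promote gold features
                    for f in phi(gold_prefix[:i], a):
                        weights[f] = weights.get(f, 0.0) + 1.0
                for i, a in enumerate(predicted):     # demote predicted features
                    for f in phi(predicted[:i], a):
                        weights[f] = weights.get(f, 0.0) - 1.0
                return

    w = {}
    early_update(w, ["SHIFT", "SHIFT", "REDUCE-L"], beam_size=2)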
Introduction
Transition-based parsing algorithms, such as shift-reduce algorithms (Nivre, 2004; Zhang and Clark, 2008), are widely used for dependency analysis because of their efficiency and comparatively good performance.
Introduction
(2004) pointed out that the drawbacks of shift-reduce parsers could be resolved by incorporating top-down information such as root finding.
Introduction
This work presents an O(n²) top-down head-driven transition-based parsing algorithm which can parse complex structures that are not trivial for shift-reduce parsers.
Related Work
Yamada and Matsumoto (2003) applied a shift-reduce algorithm to dependency analysis, which is known as the arc-standard transition-based algorithm (Nivre, 2004).
shift-reduce is mentioned in 14 sentences in this paper.
Kolomiyets, Oleksandr and Bethard, Steven and Moens, Marie-Francine
Evaluations
Table 2: Features for the shift-reduce parser (SRP) and the graph-based maximum spanning tree (MST) parser.
Evaluations
The Shift-Reduce parser (SRP; Section 4.1) and the graph-based, maximum spanning tree parser (MST; Section 4.2) are compared to these baselines.
Evaluations
In terms of labeled attachment score, both dependency parsing models outperformed the baseline models — the maximum spanning tree parser achieved 0.614 LAS, and the shift-reduce parser achieved 0.647 LAS.
Feature Design
The shift-reduce parser (SRP) trains a machine learning classifier as the oracle o ∈ (C → T) to predict a transition t from a parser configuration c = (L1, L2, Q, E), using node features such as the heads of L1, L2 and Q, and edge features from the already predicted temporal relations in E. The graph-based maximum spanning tree (MST) parser trains a machine learning model to predict SCORE(e) for an edge e = (wj, rj, wk), using features of the nodes wj and wk.
Feature Design
Note that both models share features that look at the nodes, while only the shift-reduce parser has features for previously classified edges.
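To make the configuration notation concrete, the sketch below extracts node features from the tops of L1, L2 and the front of Q, and edge features from the already predicted relations in E. The feature names and the toy input are hypothetical, not the paper's actual templates.

    # Illustrative node/edge feature extraction for a shift-reduce
    # configuration c = (L1, L2, Q, E); feature names are hypothetical.

    def extract_features(L1, L2, Q, E):
        feats = []
        if L1: feats.append("L1.head=" + L1[-1])   # node features
        if L2: feats.append("L2.head=" + L2[-1])
        if Q:  feats.append("Q.head=" + Q[0])
        for (wj, rel, wk) in E:                    # edge features over the
            if L1 and wk == L1[-1]:                # already-predicted relations
                feats.append("E.rel_into_L1=" + rel)
        return feats

    print(extract_features(["meeting"], ["lunch"], ["afterwards"],
                           [("lunch", "BEFORE", "meeting")]))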
Parsing Models
We consider two different approaches to learning a temporal dependency parser: a shift-reduce model (Nivre, 2008) and a graph-based model (McDonald et al., 2005).
Parsing Models
4.1 Shift-Reduce Parsing Model
Parsing Models
Shift-reduce dependency parsers start with an input queue of unlinked words, and link them into a tree by repeatedly choosing and performing actions like shifting a node to a stack, or popping two nodes from the stack and linking them.
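The loop this snippet describes can be written down in a few lines. Below is a minimal arc-standard sketch; the toy oracle stands in for the trained classifier, and no validity checks are done on the stack size.

    # Minimal arc-standard shift-reduce dependency parser: shift moves a
    # word from the queue to the stack; an arc action pops a stack node
    # and records it as a dependent of its neighbor. The oracle is a stub.

    def parse(words, choose_action):
        stack, queue, arcs = [], list(range(len(words))), []
        while queue or len(stack) > 1:
            action = choose_action(stack, queue)
            if action == "SHIFT":
                stack.append(queue.pop(0))
            elif action == "LEFT-ARC":    # stack[-2] becomes a dependent of stack[-1]
                dep = stack.pop(-2)
                arcs.append((stack[-1], dep))
            elif action == "RIGHT-ARC":   # stack[-1] becomes a dependent of stack[-2]
                dep = stack.pop()
                arcs.append((stack[-1], dep))
        return arcs  # list of (head index, dependent index) pairs

    def toy_oracle(stack, queue):
        # Stand-in for a classifier: shift everything, then attach rightward.
        return "SHIFT" if queue else "RIGHT-ARC"

    print(parse("economic news had little effect".split(), toy_oracle))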
shift-reduce is mentioned in 18 sentences in this paper.
Liu, Yang
Abstract
We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation.
Abstract
To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data.
Introduction
In this paper, we propose a shift-reduce parsing algorithm for phrase-based string-to-dependency translation.
Introduction
4. exploiting syntactic information: as the shift-reduce parsing algorithm generates target language dependency trees in decoding, dependency language models (Shen et al., 2008; Shen et al., 2010) can be used to encourage linguistically-motivated reordering.
Introduction
2 Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation
shift-reduce is mentioned in 27 sentences in this paper.
Zhu, Muhua and Zhang, Yue and Chen, Wenliang and Zhang, Min and Zhu, Jingbo
Abstract
Shift-reduce dependency parsers give comparable accuracies to their chart-based counterparts, yet the best shift-reduce constituent parsers still lag behind the state-of-the-art.
Abstract
One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input.
Abstract
We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search.
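One way to realize the extension sketched in this abstract is to let finished candidates keep taking a no-op action so that every action sequence compared in the beam has the same length. The sketch below uses an IDLE action for this; it is a simplification for illustration, not the paper's exact mechanism.

    # Pad finished action sequences with a no-op IDLE action so that all
    # candidates in the beam have equal length and scores stay comparable.

    def pad_with_idle(action_sequences):
        longest = max(len(seq) for seq in action_sequences)
        return [seq + ["IDLE"] * (longest - len(seq)) for seq in action_sequences]

    print(pad_with_idle([["SHIFT", "REDUCE", "SHIFT"],
                         ["SHIFT", "UNARY", "REDUCE", "SHIFT", "REDUCE"]]))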
Baseline parser
We adopt the parser of Zhang and Clark (2009) for our baseline, which is based on the shift-reduce process of Sagae and Lavie (2005), and employs global perceptron training and beam search.
Baseline parser
2.1 Vanilla Shift-Reduce
Baseline parser
Shift-reduce parsing is based on a left-to-right scan of the input sentence.
Introduction
Transition-based parsers employ a set of shift-reduce actions and perform parsing using a sequence of state transitions.
Introduction
We propose an extension to the shift-reduce process to address this problem, which gives significant improvements to the parsing accuracies.
Semi-supervised Parsing with Large Data
This section discusses how to extract information from unlabeled data or auto-parsed data to further improve shift-reduce parsing accuracies.
Semi-supervised Parsing with Large Data
Based on the information, we propose a set of novel features specifically designed for shift-reduce constituent parsing.
shift-reduce is mentioned in 17 sentences in this paper.
Ji, Yangfeng and Eisenstein, Jacob
Abstract
The resulting shift-reduce discourse parser obtains substantial improvements over the previous state-of-the-art in predicting relations and nuclearity on the RST Treebank.
Introduction
Our method is implemented as a shift-reduce discourse parser (Marcu, 1999; Sagae, 2009).
Model
2.1 Shift-reduce discourse parsing
Model
We construct RST Trees using shift-reduce parsing, as first proposed by Marcu (1999).
Model
C: the total number of classes, which correspond to possible shift-reduce operations
shift-reduce is mentioned in 16 sentences in this paper.
Xu, Wenduan and Clark, Stephen and Zhang, Yue
Abstract
This paper presents the first dependency model for a shift-reduce CCG parser.
Introduction
In this paper, we fill a gap in the literature by developing the first dependency model for a shift-reduce CCG parser.
Introduction
Shift-reduce parsing applies naturally to CCG (Zhang and Clark, 2011), and the left-to-right, incremental nature of the decoding fits with CCG’s cognitive claims.
Introduction
Results on the standard CCGBank tests show that our parser achieves absolute labeled F-score gains of up to 0.5 over the shift-reduce parser of Zhang and Clark (2011); and up to 1.05 and 0.64 over the normal-form and hybrid models of Clark and Curran (2007), respectively.
Shift-Reduce with Beam-Search
This section describes how shift-reduce techniques can be applied to CCG, following Zhang and Clark (2011).
Shift-Reduce with Beam-Search
First we describe the deterministic process which a parser would follow when tracing out a single, correct derivation; then we describe how a model of normal-form derivations — or, more accurately, a sequence of shift-reduce actions leading to a normal-form derivation — can be used with beam-search to develop a nondeterministic parser which selects the highest scoring sequence of actions.
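A generic beam-search skeleton over shift-reduce action sequences, of the kind this snippet describes, is given below. Every helper callable is an assumption supplied by the caller; this is a sketch of the search procedure, not Zhang and Clark's code.

    # Beam search over action sequences: each candidate carries its
    # accumulated score, and only the beam_size best survive each step.

    import heapq

    def beam_decode(init_state, legal_actions, apply_action, score,
                    is_final, beam_size=16):
        beam = [(0.0, init_state)]
        while not all(is_final(s) for _, s in beam):
            expanded = []
            for prefix_score, state in beam:
                if is_final(state):
                    expanded.append((prefix_score, state))  # keep finished items
                    continue
                for a in legal_actions(state):
                    expanded.append((prefix_score + score(state, a),
                                     apply_action(state, a)))
            beam = heapq.nlargest(beam_size, expanded, key=lambda item: item[0])
        return max(beam, key=lambda item: item[0])[1]

    # Toy demo: states are action tuples, final at length 3, flat scores.
    print(beam_decode((), lambda s: ["SHIFT", "REDUCE"],
                      lambda s, a: s + (a,), lambda s, a: 0.0,
                      lambda s: len(s) == 3, beam_size=4))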
Shift-Reduce with Beam-Search
Note this section only describes a normal-form derivation model for shift-reduce parsing.
shift-reduce is mentioned in 21 sentences in this paper.
Zhao, Hai and Song, Yan and Kit, Chunyu and Zhou, Guodong
Dependency Parsing: Baseline
In detail, a shift-reduce method is adopted as in (Nivre, 2003), where a classifier is used to make a parsing decision step by step.
Dependency Parsing: Baseline
While memory-based and margin-based learning approaches such as support vector machines are popularly applied to shift-reduce parsing, we apply a maximum entropy model as the learning model for efficient training, adopting overlapped features as in our work in (Zhao and Kit, 2008), especially the character-level ones for Chinese parsing.
Dependency Parsing: Baseline
Without this type of feature, a shift-reduce parser may directly scan through an input sequence in linear time.
Evaluation Results
A beam search algorithm with width 5 is used for parsing; otherwise, simple shift-reduce decoding is used.
shift-reduce is mentioned in 4 sentences in this paper.
Rastrow, Ariya and Dredze, Mark and Khudanpur, Sanjeev
Incorporating Syntactic Structures
A similar idea of equivalent states is used by Huang and Sagae (2010), except they use equivalence to facilitate dynamic programming for shift-reduce parsing, whereas we generalize it for improving the processing time of similar hypotheses in general models.
Incorporating Syntactic Structures
We use the best-first probabilistic shift-reduce dependency parser of Sagae and Tsujii (2007), a transition-based parser (Kubler et al., 2009) with a MaxEnt classifier.
Incorporating Syntactic Structures
Algorithm 1: Best-first shift-reduce dependency parsing
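To unpack the name in Algorithm 1: best-first shift-reduce parsing keeps parser states on an agenda ordered by model score and always expands the current best, returning the first finished state it pops. A hedged sketch follows; the helper callables are placeholders, not Sagae and Tsujii's implementation, and popping a finished state first is only guaranteed optimal when step costs (negative log-probabilities) are non-negative.

    # Best-first shift-reduce parsing: a priority queue (min-heap over
    # negative log-probability) orders parser states; the best is
    # expanded first. Helper callables are placeholders.

    import heapq
    import itertools

    def best_first_parse(init_state, legal_actions, apply_action,
                         log_prob, is_final):
        tie = itertools.count()  # break ties without comparing states
        agenda = [(0.0, next(tie), init_state)]
        while agenda:
            neg_lp, _, state = heapq.heappop(agenda)
            if is_final(state):
                return state     # first finished state popped is the best
            for a in legal_actions(state):
                heapq.heappush(agenda, (neg_lp - log_prob(state, a),
                                        next(tie), apply_action(state, a)))
        return None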
Related Work
While Huang and Sagae (2010) use the notion of “equivalent states”, they do so for dynamic programming in a shift-reduce parser to broaden the search space.
shift-reduce is mentioned in 4 sentences in this paper.
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Introduction
With regard to the task of parsing itself, an important advantage of the character-level syntax trees is that they allow word segmentation, part-of-speech (POS) tagging and parsing to be performed jointly, using an efficient CKY-style or shift-reduce algorithm.
Introduction
Our model is based on the discriminative shift-reduce parser of Zhang and Clark (2009; 2011), which is a state-of-the-art word-based phrase-structure parser for Chinese.
Introduction
We extend their shift-reduce framework, adding more transition actions for word segmentation and POS tagging, and defining novel features that capture character information.
Related Work
Our work is based on the shift-reduce operations of their work, while we introduce additional operations for segmentation and POS tagging.
shift-reduce is mentioned in 4 sentences in this paper.
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Decoding
The algorithm bears a close resemblance to the shift-reduce algorithm, where a stack is used to accumulate (partial) information about a, M_L and M_R for each a ∈ A in the derivation.
Introduction
We implement an efficient shift-reduce algorithm that facilitates the accumulation of partial context in a bottom-up fashion, allowing our model to influence the translation process even in the absence of full context.
Introduction
In Section 6, we describe our shift-reduce algorithm which inte-
shift-reduce is mentioned in 3 sentences in this paper.