Introducing Nonlocal Features | In other words, it is unlikely that we can devise a feature set informative enough for the weight vector to converge towards a solution that lets the learning algorithm see entire documents during training, at least when no external knowledge sources are used.
Introducing Nonlocal Features | Thus the learning algorithm always reaches the end of a document, avoiding the problem that early updates discard parts of the training data.
Introducing Nonlocal Features | When we applied LaSO, we noticed that it performed worse than the baseline learning algorithm when using only local features.
Introduction | The main reason why early updates underperform in our setting is that the task is too difficult and the learning algorithm is not able to profit from all of the training data.
Introduction | Put another way, early updates happen too early: the learning algorithm rarely reaches the end of an instance before it halts, updates, and moves on to the next one.
Representation and Learning | Algorithm 1 shows pseudocode for the learning algorithm, which we will refer to as the baseline learning algorithm.
Results | Since early updates do not always make use of complete documents during training, it can be expected that early updating will require either a very wide beam or more iterations to get up to par with the baseline learning algorithm.
Results | Recall that with only local features, delayed LaSO is equivalent to the baseline learning algorithm.
Results | From these results we conclude that we are better off when the learning algorithm handles one document at a time, instead of getting feedback within documents. |
Abstract | We propose an online learning algorithm based on tensor-space models. |
Abstract | We apply the proposed algorithm to a parsing task, and show that even with very little training data the learning algorithm based on a tensor model performs well, and gives significantly better results than standard learning algorithms based on traditional vector-space models.
Conclusion and Future Work | In this paper, we reformulated the traditional linear vector-space models as tensor-space models, and proposed an online learning algorithm named Tensor-MIRA. |
Introduction | Many learning algorithms applied to NLP problems, such as the Perceptron (Collins, |
Introduction | A tensor weight learning algorithm is then proposed in Section 4.
Online Learning Algorithm | Here we propose an online learning algorithm similar to MIRA but modified to accommodate tensor models. |
Online Learning Algorithm | , m, where x_i is the input and y_i is the reference or oracle hypothesis, are fed to the weight learning algorithm in sequential order.
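The sequential feed of (input, oracle) pairs described above pairs naturally with a MIRA-style update. A minimal sketch in plain Python, assuming a 1-best hypothesis and a hinge-style margin; the function names and the clip constant C are illustrative, not the paper's tensor variant:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mira_update(w, feat_gold, feat_pred, loss, C=1.0):
    """One MIRA-style 1-best update (illustrative sketch): move w just
    enough that the oracle hypothesis outscores the 1-best prediction
    by a margin of `loss`, with the step size clipped at C."""
    delta = [g - p for g, p in zip(feat_gold, feat_pred)]
    margin = dot(w, delta)            # current score gap (gold minus predicted)
    norm_sq = dot(delta, delta)
    if norm_sq == 0.0:
        return w                      # identical feature vectors: nothing to learn
    alpha = min(C, max(0.0, (loss - margin) / norm_sq))
    return [wi + alpha * di for wi, di in zip(w, delta)]
```

After the update, the gold hypothesis outscores the prediction by exactly `loss` (unless clipped by C), which is the usual passive-aggressive behavior.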
Tensor Model Construction | As a way out, we first run a simple vector-model based learning algorithm (say the Perceptron) on the training data and estimate a weight vector, which serves as a “surrogate”
Tensor Space Representation | Most of the learning algorithms for NLP problems are based on vector space models, which represent data as vectors φ ∈ R^n, and try to learn feature weight vectors w ∈ R^n such that a linear model y = w · φ is able to discriminate between, say, good and bad hypotheses.
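The linear vector-space model y = w · φ described above amounts to a dot product followed by a comparison; a minimal Python sketch (function names are illustrative):

```python
def score(w, phi):
    """Linear vector-space model: the score of a hypothesis is the
    dot product of the weight vector w and its feature vector phi."""
    return sum(wi * fi for wi, fi in zip(w, phi))

def prefers(w, phi_good, phi_bad):
    """Discriminate between two hypotheses: True if the first
    outscores the second under the current weights."""
    return score(w, phi_good) > score(w, phi_bad)
```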
Abstract | We introduce a provably correct learning algorithm for latent-variable PCFGs. |
Additional Details of the Algorithm | The learning algorithm for L-PCFGs can be used as an initializer for the EM algorithm for L-PCFGs.
Experiments on Parsing | This section describes parsing experiments using the learning algorithm for L-PCFGs. |
Experiments on Parsing | Table 1: Results on the development data (section 22) and test data (section 23) for various learning algorithms for L-PCFGs. |
Experiments on Parsing | In this special case, the L-PCFG learning algorithm is equivalent to a simple algorithm, with the following steps: 1) define the matrix Q with entries Q_{w1,w2} = count(w1, w2)/N, where count(w1, w2) is the number of times that bigram (w1, w2) is seen in the data, and N = Σ_{w1,w2} count(w1, w2).
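Step 1 of the special case above, building Q from normalized bigram counts, can be sketched as follows (a hypothetical helper, not the paper's code):

```python
from collections import Counter

def bigram_matrix(words):
    """Build Q as a sparse dict: Q[(w1, w2)] = count(w1, w2) / N,
    where count(w1, w2) is the number of occurrences of the bigram
    (w1, w2) and N is the total number of bigram tokens."""
    counts = Counter(zip(words, words[1:]))
    n = sum(counts.values())
    return {bigram: c / n for bigram, c in counts.items()}
```

By construction the entries of Q sum to one, so Q is a joint distribution over adjacent word pairs.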
The Learning Algorithm for L-PCFGS | Our goal is to design a learning algorithm for L-PCFGs. |
The Learning Algorithm for L-PCFGS | 4.2 The Learning Algorithm |
The Learning Algorithm for L-PCFGS | Figure 1 shows the learning algorithm for L-PCFGs. |
The Matrix Decomposition Algorithm | This section describes the matrix decomposition algorithm used in Step 1 of the learning algorithm . |
Abstract | Additive tree metrics can be leveraged by “meta-algorithms” such as neighbor-joining (Saitou and Nei, 1987) and recursive grouping (Choi et al., 2011) to provide consistent learning algorithms for latent trees. |
Abstract | In our learning algorithm, we assume that examples of the form (x^(i), y^(i)) for i ∈ [N] = {1, .
Abstract | The word embeddings are used during the learning process, but the final decoder that the learning algorithm outputs maps a POS tag sequence x to a parse tree.
Introduction | (2002) and employ machine learning algorithms to build classifiers from tweets with manually annotated sentiment polarity. |
Introduction | To this end, we extend the existing word embedding learning algorithm (Collobert et al., 2011) and develop three neural networks to effectively incorporate the supervision from sentiment polarity of text (e.g. |
Introduction | In terms of the accuracy of polarity consistency between each sentiment word and its top N closest words, SSWE outperforms existing word embedding learning algorithms.
Related Work | Under this assumption, many feature learning algorithms have been proposed to obtain better classification performance (Pang and Lee, 2008; Liu, 2012; Feldman, 2013).
Related Work | We extend the existing word embedding learning algorithm (Collobert et al., 2011) and develop three neural networks to learn SSWE. |
Related Work | In the following sections, we introduce the traditional method before presenting the details of SSWE learning algorithms.
Abstract | We use aligned subsequences as features for machine learning algorithms in order to infer rules for linguistic changes undergone by words when entering new languages and to discriminate between cognates and non-cognates. |
Conclusions and Future Work | and Waterman, 1981), and other learning algorithms for discriminating between cognates and non-cognates. |
Our Approach | Therefore, because the edit distance has been widely used in this research area and has produced good results, we are encouraged to employ orthographic alignment for identifying pairs of cognates, not only to compute similarity scores, as was previously done, but also to use aligned subsequences as features for machine learning algorithms.
Our Approach | 3.3 Learning Algorithms |
Implementation | The learning algorithm is applied in a shift-reduce parser, where the training data consists of the (unique) list of shift and reduce operations required to produce the gold RST parses. |
Introduction | Alternatively, our approach can be seen as a nonlinear learning algorithm for incremental structure prediction, which overcomes feature sparsity through effective parameter tying. |
Large-Margin Learning Framework | of our learning algorithm are different. |
Large-Margin Learning Framework | Algorithm 1: Mini-batch learning algorithm. Input: training set D, regularization parameters λ and τ, number of iterations T, initialization matrix A0, and threshold ε. while t = 1, . . . , T do
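The flattened pseudocode above is the skeleton of a standard mini-batch training loop; a generic Python sketch with the batch update left abstract (all names are illustrative, and the paper's regularization and threshold logic is omitted):

```python
def minibatch_train(data, update, w0, batch_size=8, epochs=3):
    """Generic mini-batch loop: for each of `epochs` passes over the
    training set, slice it into consecutive batches and apply the
    caller-supplied `update(w, batch)` rule to each batch."""
    w = w0
    for _ in range(epochs):
        for i in range(0, len(data), batch_size):
            w = update(w, data[i:i + batch_size])
    return w
```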
Feature Weighting Methods | ing, better classification and regression models can be built by using the feature weights generated by these models as a pre-weight on the data points for other machine learning algorithms.
Related Work | Noise tolerance techniques aim to improve the learning algorithm itself to avoid over-fitting caused by mislabeled instances in the training phase, so that the constructed classifier becomes more noise-tolerant. |
Related Work | Decision tree (Mingers, 1989; Vannoorenberghe and Denoeux, 2002) and boosting (Jiang, 2001; Kalai and Servedio, 2005; Karmaker and Kwek, 2006) are two learning algorithms that have been investigated in many studies.
Related Work | For example, useful information can be removed with noise elimination, since annotation errors are likely to occur on ambiguous instances that are potentially valuable for learning algorithms.
Abstract | This section first introduces the fundamental supervised learning method, and then describes a baseline active learning algorithm . |
Abstract | 3.2 Active Learning Algorithm |
Abstract | 4.4 Bilingual Active Learning Algorithm |
Introduction | In order to address these unique challenges for wikification for the short tweets, we employ graph-based semi-supervised learning algorithms (Zhu et al., 2003; Smola and Kondor, 2003; Blum et al., 2004; Zhou et al., 2004; Talukdar and Crammer, 2009) for collective inference by exploiting the manifold (cluster) structure in both unlabeled and labeled data. |
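The graph-based semi-supervised learners cited above can be illustrated with a minimal label-propagation sketch in the spirit of Zhu et al. (2003): unlabeled nodes iteratively take the weighted average of their neighbors' scores while labeled nodes stay clamped. The graph representation and names here are assumptions for illustration, not the authors' implementation:

```python
def propagate_labels(weights, labels, iters=100):
    """Iterative label propagation on a weighted graph.

    weights: dict mapping node -> {neighbor: edge weight}
    labels:  dict mapping labeled node -> score in [0, 1]
    Unlabeled nodes start at 0.5 and converge toward the harmonic
    solution; labeled nodes are clamped to their given scores."""
    scores = {v: labels.get(v, 0.5) for v in weights}
    for _ in range(iters):
        for v, nbrs in weights.items():
            if v in labels:
                continue  # clamp labeled nodes
            total = sum(nbrs.values())
            if total > 0:
                scores[v] = sum(wt * scores[u] for u, wt in nbrs.items()) / total
    return scores
```

On a chain a–b–c with a labeled 1 and c labeled 0, the middle node settles at the average of its two neighbors, which is the harmonic-function behavior the cited work exploits for collective inference.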
Introduction | effort to explore graph-based semi-supervised learning algorithms for the wikification task. |
Semi-supervised Graph Regularization | We propose a novel semi-supervised graph regularization framework based on the graph-based semi-supervised learning algorithm (Zhu et al., 2003): |
Add arc <eC,ej> to GC with | As we employed the MIRA learning algorithm, it is possible to identify which specific features are useful by looking at the weights learned for each feature using the training data.
Add arc <eC,ej> to GC with | Other text-level discourse parsing methods include: (1) Percep-coarse: we replace MIRA with the averaged perceptron learning algorithm, and the other settings are the same as Our-coarse; (2) HILDA-manual and HILDA-seg are from Hernault (2010b)’s work, and their input EDUs are from RST-DT and their own EDU segmenter, respectively; (3) LeThanh indicates the results given by LeThanh et al.
Add arc <eC,ej> to GC with | We can also see that the averaged perceptron learning algorithm, though simple, can achieve comparable performance, better than HILDA-manual.
Introduction | Implicitly, the weight learning algorithm can be seen as a gradient descent procedure minimizing the difference between the scores of the highest-scoring (Viterbi) state sequences and the label state sequences.
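The score-difference view above corresponds to a structured-perceptron-style step: the (sub)gradient of score(Viterbi) − score(label) adds the label-sequence features and subtracts the Viterbi-sequence features. A minimal sketch with sparse feature dicts (names are illustrative, not the paper's code):

```python
def perceptron_step(w, feats_label, feats_viterbi, lr=1.0):
    """One gradient step on score(viterbi) - score(label) for a linear
    model: boost features of the label sequence, penalize features of
    the current Viterbi sequence. Weights and features are sparse
    dicts mapping feature name -> value."""
    w = dict(w)  # do not mutate the caller's weights
    for f, c in feats_label.items():
        w[f] = w.get(f, 0.0) + lr * c
    for f, c in feats_viterbi.items():
        w[f] = w.get(f, 0.0) - lr * c
    return w
```

When the Viterbi sequence already equals the label sequence, the two feature dicts coincide and the update is a no-op, which is exactly the perceptron's convergence condition.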
Introduction | Pseudocode of the learning algorithm for the partially labeled case is given in Algorithm 1. |
Introduction | We see that while all three learning algorithms perform better than the baseline, the performance of the purely unsupervised system is inferior to supervised approaches. |
Related Work | Our work builds on one such approach — SampleRank (Wick et al., 2011), a sampling-based learning algorithm . |
Sampling-Based Dependency Parsing with Global Features | We begin with the notation before addressing the decoding and learning algorithms . |
Sampling-Based Dependency Parsing with Global Features | Figure 4 summarizes the learning algorithm.