A ← METRICLEARNER(X, S, D) | We also experimented with the supervised large-margin metric learning algorithm (LMNN) presented in (Weinberger and Saul, 2009).
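For concreteness, below is a minimal sketch of learning an LMNN metric. It assumes the open-source metric-learn package as a stand-in for the authors' implementation, and uses a stock dataset as placeholder data; neither is from the original paper.

```python
# Minimal sketch: learning an LMNN metric (Weinberger and Saul, 2009)
# via the metric-learn package. X, y are placeholder data, not the
# experiments reported in the paper.
from metric_learn import LMNN
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

lmnn = LMNN()                       # default hyperparameters; tune in practice
lmnn.fit(X, y)                      # learns a Mahalanobis metric from labels
X_transformed = lmnn.transform(X)   # project data into the learned space
```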
Abstract | We initiate a study comparing the effectiveness of the transformed spaces learned by recently proposed supervised and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
Abstract | Through a variety of experiments on different real-world datasets, we find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
Conclusion | In this paper, we compared the effectiveness of the transformed spaces learned by recently proposed supervised and semi-supervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA).
Introduction | Recently, several supervised metric learning algorithms have been proposed (Davis et al., 2007; Weinberger and Saul, 2009). |
Introduction | Even though different supervised and semi-supervised metric learning algorithms have recently been proposed, the effectiveness of the transformed spaces learned by them in NLP tasks has not been studied before.
Introduction | We find IDML-IT, a semi-supervised metric learning algorithm, to be the most effective.
Metric Learning | We shall now review two recently proposed metric learning algorithms.
Metric Learning | The ITML metric learning algorithm, which we reviewed in Section 2.2, is supervised in nature and hence does not exploit widely available unlabeled data.
Metric Learning | In this section, we review Inference Driven Metric Learning (IDML) (Algorithm 1) (Dhillon et al., 2010), a recently proposed metric learning framework that combines an existing supervised metric learning algorithm (such as ITML) with transductive graph-based label inference to learn a new distance metric from both labeled and unlabeled data.
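A hedged sketch of the IDML loop as described: alternate supervised metric learning with graph-based label inference, promoting confidently inferred labels into the training set until no new points are added. The concrete components (metric-learn's ITML_Supervised, scikit-learn's LabelPropagation, the confidence threshold beta) are our assumptions, not the paper's exact implementation.

```python
# Illustrative IDML loop (after Dhillon et al., 2010). y uses -1 to
# mark unlabeled instances; beta is an assumed confidence threshold.
import numpy as np
from metric_learn import ITML_Supervised
from sklearn.semi_supervised import LabelPropagation

def idml(X, y, beta=0.9, max_iters=10):
    y = y.copy()
    for _ in range(max_iters):
        labeled = y != -1
        metric = ITML_Supervised().fit(X[labeled], y[labeled])
        Z = metric.transform(X)                  # transformed space
        prop = LabelPropagation().fit(Z, y)      # graph-based inference
        conf = prop.label_distributions_.max(axis=1)
        newly = (~labeled) & (conf >= beta)      # confident unlabeled points
        if not newly.any():
            break
        y[newly] = prop.transduction_[newly]     # promote inferred labels
    return metric, y
```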
Abstract | We learn this correspondence with a reinforcement learning algorithm, using the deviation of the route we follow from the intended path as a reward signal.
Approximate Dynamic Programming | To learn these weights θ we use SARSA (Sutton and Barto, 1998), an online learning algorithm similar to Q-learning (Watkins and Dayan, 1992).
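A compact sketch of SARSA with linear function approximation, matching the weight-update form used here. The feature function phi(s, a) and the environment interface (env.reset, env.step, env.actions) are placeholders of our own, not the paper's system.

```python
# SARSA with linear function approximation (Sutton and Barto, 1998):
# theta <- theta + alpha * (r + gamma*Q(s',a') - Q(s,a)) * phi(s,a)
import numpy as np

def sarsa(env, phi, n_features, episodes=1000,
          alpha=0.05, gamma=1.0, epsilon=0.1):
    theta = np.zeros(n_features)
    q = lambda s, a: theta.dot(phi(s, a))
    def pick(s):
        if np.random.rand() < epsilon:                       # explore
            return np.random.choice(env.actions(s))
        return max(env.actions(s), key=lambda a: q(s, a))    # exploit
    for _ in range(episodes):
        s = env.reset()
        a = pick(s)
        done = False
        while not done:
            s2, r, done = env.step(s, a)
            a2 = pick(s2) if not done else None
            target = r + (gamma * q(s2, a2) if not done else 0.0)
            theta += alpha * (target - q(s, a)) * phi(s, a)  # TD update
            s, a = s2, a2
    return theta
```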
Approximate Dynamic Programming | Algorithm 1 details the learning algorithm, which we follow here.
Approximate Dynamic Programming | We also compare against the policy gradient learning algorithm of Branavan et al. |
Introduction | Solved using a reinforcement learning algorithm, our system acquires the meaning of spatial words through grounded interaction with the world.
Introduction | We frame direction following as an apprenticeship learning problem and solve it with a reinforcement learning algorithm, extending previous work on interpreting instructions by Branavan et al.
Reinforcement Learning Formulation | Our learning algorithm is not dependent on a deterministic environment.
Reinforcement Learning Formulation | Learning exactly which words influence decision making is difficult; reinforcement learning algorithms have problems with the large, sparse feature vectors common in natural language processing. |
Introduction | We further propose a specific hierarchical learning algorithm, called the HL-SOT algorithm, which is developed by generalizing the online-learning algorithm H-RLS (Cesa-Bianchi et al., 2006).
Introduction | • A specific hierarchical learning algorithm is proposed to solve the formulated problem.
Related Work | In (Turney, 2002), an unsupervised learning algorithm was proposed to classify reviews as recommended or not recommended by averaging the sentiment annotations of phrases in reviews that contain adjectives or adverbs.
The HL-SOT Approach | Then a specific hierarchical learning algorithm is further proposed to solve the formulated problem. |
The HL-SOT Approach | Therefore, we propose a specific hierarchical learning algorithm, named the HL-SOT algorithm, that is able to train each node classifier in a batch-learning setting and allows the threshold of each node classifier to be learned separately.
The HL-SOT Approach | Then the hierarchical classification function f is parameterized by the weight matrix W = (w_1, ..., w_N)^T and threshold vector θ = (θ_1, ..., θ_N)^T. The hierarchical learning algorithm HL-SOT is proposed for learning the parameters W and θ.
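To illustrate how W and θ parameterize f, here is a hedged sketch of hierarchical prediction over a sentiment ontology tree: a node's linear classifier is only consulted if its parent's label was assigned, so predictions respect the tree. The tree encoding (a parent-index array with -1 for the root, parents preceding children) is our assumption for illustration.

```python
# Sketch of the hierarchical classification function f with parameters
# W = (w_1, ..., w_N)^T and theta = (theta_1, ..., theta_N)^T.
import numpy as np

def hl_sot_predict(x, W, theta, parent):
    """parent[i] is node i's parent index; -1 marks the root."""
    N = len(theta)
    y = np.zeros(N, dtype=bool)
    for i in range(N):                      # assumes parents precede children
        if parent[i] == -1 or y[parent[i]]:
            y[i] = W[i].dot(x) >= theta[i]  # per-node thresholded linear rule
    return y
```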
Abstract | We present an efficient approximate approach for learning this environment model as part of a policy-gradient reinforcement learning algorithm for text interpretation. |
Algorithm | The learning algorithm is provided with a set of documents d ∈ D, an environment in which to execute command sequences, and a reward function. The goal is to estimate two sets of parameters: 1) the parameters θ of the policy function, and 2) the partial environment transition model q(s′ | s, c), which is the observed portion of the true model p(s′ | s, c).
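To make the two estimates concrete, below is a hedged Python sketch: a REINFORCE-style update of the policy parameters θ alongside a count-based estimate of the observed transition model q. The environment and feature interfaces (env.sample_action, env.expected_phi, phi, reward) are placeholders of our own, not the paper's exact update.

```python
# Sketch: policy-gradient learning while recording observed transitions.
# grad accumulates the log-linear policy gradient phi(s,c) - E[phi].
from collections import Counter, defaultdict
import numpy as np

def train(docs, env, phi, n_features, reward, episodes=100, alpha=0.01):
    theta = np.zeros(n_features)
    q = defaultdict(Counter)                # q[(s, c)][s2] = observed count
    for _ in range(episodes):
        for d in docs:
            history, grad = [], np.zeros(n_features)
            s = env.initial_state(d)
            while not env.done(s):
                c = env.sample_action(s, theta, phi)   # stochastic policy
                s2 = env.execute(s, c)
                q[(s, c)][s2] += 1                     # record transition
                grad += phi(s, c) - env.expected_phi(s, theta, phi)
                history.append((s, c))
                s = s2
            theta += alpha * reward(history) * grad    # policy-gradient step
    return theta, q
```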
Introduction | Our method efficiently achieves both of these goals as part of a policy-gradient reinforcement learning algorithm.
Related Work | Interpreting Instructions Our approach is most closely related to the reinforcement learning algorithm for mapping text instructions to commands developed by Branavan et al. |
Related Work | We address this limitation by expanding a policy learning algorithm to take advantage of a partial environment model estimated during learning. |
Discussion | Each model describes the diachronic, population-level consequences of assuming a particular learning algorithm for individuals. |
Discussion | By using simple models, we were able to consider a range of learning algorithms corresponding to different explanations for the observed diachronic dynamics. |
Modeling preliminaries | This setting allows us to determine the diachronic, population-level consequences of assumptions about the learning algorithm used by individuals, as well as assumptions about population structure or the input they receive. |
Models | We now describe five dynamical systems (DS) models, each corresponding to a learning algorithm A used by individual language learners.
Models | The models differ along two dimensions, corresponding to assumptions about the learning algorithm (A): whether or not it is assumed that the stress of examples is possibly mistransmitted (Models 1, 3, 5), and how the N and V probabilities are estimated.
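As a toy instance of this dynamical-systems setup, the sketch below iterates a population-level probability x_t that a word carries noun stress: each generation of learners estimates it from examples drawn from the previous generation, with probability p of mistransmission. The specific map is illustrative only, not one of the paper's five models.

```python
# Toy mean-field DS iteration: learners match the stress they hear,
# and each example's stress is flipped with probability p in transit.
def iterate(x0, p=0.05, generations=50):
    xs = [x0]
    for _ in range(generations):
        x = xs[-1]
        heard = x * (1 - p) + (1 - x) * p   # prob. an example sounds stressed
        xs.append(heard)                    # next generation's estimate
    return xs
```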
Comparing the two Datasets | Table 2: Performance of several machine learning algorithms on the English TempEval-1 training data, with cross-validation.
Comparing the two Datasets | Table 3: Performance of several machine learning algorithms on the Portuguese data for the TempEval-1 tasks. |
Introduction | The results of machine learning algorithms over the data thus obtained are compared to those reported for the English TempEval-1 competition.
Extraction with Lexicons | A learning algorithm expands the seed phrases into a set of lexicons. |
Extraction with Lexicons | The semantic lexicons are added as features to the CRF learning algorithm.
Introduction | When learning an extractor for relation R, LUCHS extracts seed phrases from R’s training data and uses a semi-supervised learning algorithm to create several relation-specific lexicons at different points on a precision-recall spectrum. |
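A hedged sketch in the spirit of this step: rank candidate phrases by similarity to the seed phrases and cut the ranking at several thresholds to obtain lexicons at different points on the precision-recall spectrum. The vector representation and threshold values are our assumptions, not LUCHS's actual method.

```python
# Illustrative seed expansion: cosine similarity to the seed centroid,
# with higher thresholds yielding smaller, higher-precision lexicons.
import numpy as np

def expand_seeds(seed_vecs, cand_phrases, cand_vecs,
                 thresholds=(0.9, 0.7, 0.5)):
    centroid = seed_vecs.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    sims = cand_vecs.dot(centroid) / np.linalg.norm(cand_vecs, axis=1)
    return {t: [p for p, s in zip(cand_phrases, sims) if s >= t]
            for t in thresholds}
```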
Introduction | We then present the OntoUSP Markov logic network and the inference and learning algorithms used with it. |
Unsupervised Ontology Induction with Markov Logic | Finally, we describe the learning algorithm and how OntoUSP induces the ontology while learning the semantic parser. |
Unsupervised Ontology Induction with Markov Logic | Algorithm 2 gives pseudo-code for OntoUSP’s learning algorithm.
Experimental Setup | Training We obtained phrase-based salience scores using a supervised machine learning algorithm . |
Modeling | We obtain these scores from the output of a supervised machine learning algorithm that predicts for each phrase whether it should be included in the highlights or not (see Section 5 for details). |
Modeling | Let f_i denote the salience score for phrase i, determined by the machine learning algorithm, and l_i its length in tokens.
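The scores f_i and lengths l_i naturally feed a budgeted selection problem: choose phrases maximizing total salience subject to a length limit L. A simple greedy ratio heuristic is sketched below as an approximation; the paper's actual formulation (e.g., an exact ILP) may differ.

```python
# Greedy budgeted phrase selection: sort by salience-per-token and
# take each phrase that still fits within the length budget L.
def select_phrases(f, l, L):
    order = sorted(range(len(f)), key=lambda i: f[i] / l[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + l[i] <= L:
            chosen.append(i)
            used += l[i]
    return chosen
```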
Experiments | The CRF extractors are trained using the same learning algorithm and feature selection as TextRunner. |
Introduction | • Using the same learning algorithm and features as TextRunner, we compare four different ways to generate positive and negative training data with TextRunner’s method, concluding that our Wikipedia heuristic is responsible for the bulk of WOE’s improved accuracy.
Wikipedia-based Open IE | WOEpos uses the same learning algorithm and selection of features as TextRunner: a second-order CRF chain model is trained with the Mallet package (McCallum, 2002).
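For illustration, the sketch below trains a linear-chain CRF on toy featurized sequences using sklearn-crfsuite as a stand-in; note it fits a first-order chain, unlike the second-order Mallet model described above, and the data and labels are invented placeholders.

```python
# Illustrative CRF training (sklearn-crfsuite stand-in for Mallet).
import sklearn_crfsuite

# Toy featurized sequences: one feature dict per token.
X_train = [[{'word': 'Obama'}, {'word': 'was'}, {'word': 'born'}],
           [{'word': 'Paris'}, {'word': 'is'}, {'word': 'nice'}]]
y_train = [['ENT', 'O', 'O'], ['ENT', 'O', 'O']]

crf = sklearn_crfsuite.CRF(algorithm='lbfgs', max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict(X_train))   # predicted label sequences
```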