Index of papers in Proc. ACL 2014 that mention
  • word representations
Nguyen, Thien Huu and Grishman, Ralph
Experiments
We have the same observation as Plank and Moschitti (2013) that when the gold-standard labels are used, the impact of word representations is limited, since the gold-standard information seems to dominate.
Experiments
However, whenever the gold labels are not available or are inaccurate, word representations would be useful for improving adaptation performance.
Experiments
This section examines the effectiveness of word representations for RE across domains.
Introduction
The application of word representations such as word clusters in domain adaptation of RE (Plank and Moschitti, 2013) is motivated by their successes in semi-supervised methods (Chan and Roth, 2010; Sun et al., 2011), where word representations help to reduce the data sparseness of lexical information in the training data.
Introduction
In DA terms, since the vocabularies of the source and target domains are usually different, word representations would mitigate the lexical sparsity by providing general features of words that are shared across domains and hence bridge the gap between domains.
Regularization
Given the more general representations provided by word representations above, how can we learn a relation extractor from the labeled source domain data that generalizes well to new domains?
Regularization
In fact, this setting can benefit considerably from our general approach of applying word representations and regularization.
Word Representations
We consider two types of word representations and use them as additional features in our DA system, namely Brown word clustering (Brown et al., 1992) and word embeddings (Bengio et al., 2001).
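A minimal sketch of how both kinds of representations might enter a feature-based extractor as additional features; the cluster bit strings, vectors, and feature names below are invented for illustration and are not taken from the paper.

    # Sketch: add Brown-cluster prefixes and embedding dimensions to the
    # ordinary lexical features of a word. All values are made up.
    BROWN_CLUSTERS = {                  # word -> Brown cluster bit string (hypothetical)
        "acquired": "110100",
        "purchased": "110101",
    }
    EMBEDDINGS = {                      # word -> dense vector (hypothetical, dim = 3)
        "acquired": [0.21, -0.40, 0.05],
        "purchased": [0.19, -0.37, 0.08],
    }

    def word_features(word, prefix_lengths=(4, 6)):
        """Return a sparse feature dict for one word."""
        feats = {"w=" + word: 1.0}                        # plain lexical feature
        cluster = BROWN_CLUSTERS.get(word)
        if cluster:
            for p in prefix_lengths:                      # cluster prefixes are shared
                feats["brown%d=%s" % (p, cluster[:p])] = 1.0   # by related words
        for i, v in enumerate(EMBEDDINGS.get(word, [])):
            feats["emb%d" % i] = v                        # real-valued embedding dims
        return feats

    print(word_features("acquired"))
    print(word_features("purchased"))   # shares brown4=1101 with "acquired"

Because the short cluster prefix is identical for the two near-synonyms, a classifier trained on one of them can still fire useful features on the other in a new domain, which is the cross-domain generalization the paper is after.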
“word representations” is mentioned in 11 sentences in this paper.
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Experiments
As mentioned in Section 3.1, the knowledge learned from the WRRBM can be investigated incrementally: using word representations corresponds to initializing only the projection layer of the web-feature module with the projection matrix of the learned WRRBM, while using n-gram-level representations corresponds to initializing both the projection and sigmoid layers of the web-feature module with the learned WRRBM.
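The two schemes can be pictured as copying different parts of the learned WRRBM into the web-feature module; the matrix names and shapes below are placeholders, not the authors' code.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab, emb_dim, n, hidden = 1000, 50, 3, 200

    # Stand-ins for matrices learned by the WRRBM (randomly filled here).
    wrrbm_projection = rng.normal(size=(vocab, emb_dim))       # word embeddings
    wrrbm_sigmoid = rng.normal(size=(n * emb_dim, hidden))     # sigmoid-layer weights

    # Web-feature module of the tagger, randomly initialized.
    module_projection = rng.normal(size=(vocab, emb_dim))
    module_sigmoid = rng.normal(size=(n * emb_dim, hidden))

    def init_word_level():
        """'word': reuse only the projection (word-representation) layer."""
        module_projection[:] = wrrbm_projection

    def init_ngram_level():
        """'ngram': reuse both the projection and the sigmoid layer."""
        module_projection[:] = wrrbm_projection
        module_sigmoid[:] = wrrbm_sigmoid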
Experiments
“word” and “ngram” denote using word representations and n-gram representations, respectively.
Experiments
From Figures 2, 3 and 4, we can see that adopting the n-gram-level representation consistently achieves better performance than using word representations only (“word-fixed” vs “ngram-fixed”, “word-adjusted” vs “ngram-adjusted”).
Learning from Web Text
We utilize the Word Representation RBM (WRRBM) factorization proposed by Dahl et al.
Learning from Web Text
The basic idea is to share word representations across different positions in the input n-gram while using position-dependent weights to distinguish between different word orders.
Learning from Web Text
The model parameters, including the position-dependent weight matrices {W(1), . . . , W(n)}, can be trained using a Metropolis-Hastings-based CD variant, and the learned word representations also capture certain syntactic information; see Dahl et al. (2012).
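A toy version of that factorization, assuming arbitrary dimensions: a single embedding matrix E is shared by every position of the n-gram, while each position k has its own weight matrix W(k), so reordering the words changes the hidden activations.

    import numpy as np

    rng = np.random.default_rng(1)
    vocab, emb_dim, hidden, n = 20, 8, 16, 3

    E = rng.normal(size=(vocab, emb_dim))                       # shared word embeddings
    W = [rng.normal(size=(emb_dim, hidden)) for _ in range(n)]  # position-dependent weights
    b = np.zeros(hidden)

    def hidden_activations(ngram_ids):
        """Sigmoid hidden units for an n-gram of word ids: position k
        contributes E[word_k] @ W[k], so word order matters even though
        the embeddings are shared across positions."""
        pre = b + sum(E[w] @ W[k] for k, w in enumerate(ngram_ids))
        return 1.0 / (1.0 + np.exp(-pre))

    print(hidden_activations([3, 7, 11]))
    print(hidden_activations([11, 7, 3]))   # same words, different order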
Neural Network for POS Disambiguation
We can choose to use only the word representations of the learned WRRBM.
“word representations” is mentioned in 7 sentences in this paper.
Tsvetkov, Yulia and Boytsov, Leonid and Gershman, Anatole and Nyberg, Eric and Dyer, Chris
Experiments
(2013) in that it uses additional features (vector space word representations) and a different classification method (we use random forests while Tsvetkov et al.
Methodology
We define three main feature categories: (1) abstractness and imageability, (2) supersenses, and (3) unsupervised vector-space word representations; each category corresponds to a group of features with a common theme and representation.
Methodology
• Vector space word representations.
Methodology
Vector space word representations learned using unsupervised algorithms are often effective features in supervised learning methods (Turian et al., 2010).
Model and Feature Extraction
Vector space word representations.
Model and Feature Extraction
We employ 64-dimensional vector-space word representations constructed by Faruqui and Dyer (2014). The vector construction algorithm is a variation on traditional latent semantic analysis (Deerwester et al., 1990) that uses multilingual information to produce representations in which synonymous words have similar vectors.
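A minimal sketch of this feature category in use, feeding dense word vectors to a random-forest classifier via the scikit-learn API; the 64-dimensional vectors and labels below are fabricated stand-ins for the real representations and annotations.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(2)

    # One 64-dimensional vector per training instance (e.g., the vector of a
    # content word, or an average over the phrase); fabricated here.
    n_instances, dim = 40, 64
    X = rng.normal(size=(n_instances, dim))
    y = rng.integers(0, 2, size=n_instances)    # binary label, e.g. metaphorical vs. literal

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    print(clf.predict(X[:5]))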
“word representations” is mentioned in 7 sentences in this paper.
Hermann, Karl Moritz and Blunsom, Phil
Introduction
Within a monolingual context, the distributional hypothesis (Firth, 1957) forms the basis of most approaches for learning word representations.
Introduction
Unlike most methods for learning word representations, which are restricted to a single language, our approach learns to represent meaning across languages in a shared multilingual semantic space.
Related Work
Neural language models are another popular approach for inducing distributed word representations (Bengio et al., 2003).
Related Work
Unsupervised word representations can easily be plugged into a variety of NLP-related tasks.
Related Work
Hermann and Blunsom (2014) propose a large-margin learner for multilingual word representations, similar to the basic additive model proposed here, which, like the approaches above, relies on a bag-of-words model for sentence representations.
“word representations” is mentioned in 5 sentences in this paper.
Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing
Abstract
Most existing algorithms for learning continuous word representations typically only model the syntactic context of words but ignore the sentiment of text.
Introduction
Accordingly, a crucial step is to learn the word representation (or word embedding), a dense, low-dimensional, real-valued vector for a word.
Related Work
However, the one-hot word representation cannot sufficiently capture the complex linguistic characteristics of words.
Related Work
The results of bag-of-ngram (uni/bi/tri-gram) features are not satisfactory because the one-hot word representation cannot capture the latent connections between words.
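The limitation is easy to see in a toy comparison: any two distinct one-hot vectors are orthogonal, whereas dense vectors (invented below) can place related words close together.

    import numpy as np

    vocab = ["good", "great", "terrible"]

    def one_hot(word):
        v = np.zeros(len(vocab))
        v[vocab.index(word)] = 1.0
        return v

    # Hypothetical dense embeddings; similar sentiment -> similar vectors.
    dense = {
        "good": np.array([0.9, 0.1]),
        "great": np.array([0.8, 0.2]),
        "terrible": np.array([-0.7, 0.3]),
    }

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cos(one_hot("good"), one_hot("great")))   # 0.0: no latent connection
    print(cos(dense["good"], dense["great"]))       # close to 1: similarity captured
    print(cos(dense["good"], dense["terrible"]))    # negative: opposite sentiment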
Related Work
In this paper, we propose learning continuous word representations as features for Twitter sentiment classification under a supervised learning framework.
“word representations” is mentioned in 5 sentences in this paper.
Hermann, Karl Moritz and Das, Dipanjan and Weston, Jason and Ganchev, Kuzman
Abstract
Given labeled data annotated with frame-semantic parses, we learn a model that projects the set of word representations for the syntactic context around a predicate to a low-dimensional representation.
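One way to picture this is a learned linear map applied to the concatenated representations of the context words; the dimensions and the randomly filled matrix below are placeholders, not the authors' trained model.

    import numpy as np

    rng = np.random.default_rng(3)
    emb_dim, context_size, low_dim = 128, 5, 10

    # Embeddings of the words in the syntactic context of a predicate (fabricated).
    context_vectors = rng.normal(size=(context_size, emb_dim))

    # Projection matrix; in the actual model it is learned from frame-annotated
    # data so that contexts evoking the same frame land close together.
    M = rng.normal(size=(context_size * emb_dim, low_dim))

    low_dim_repr = context_vectors.reshape(-1) @ M   # shape: (low_dim,)
    print(low_dim_repr.shape)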
Introduction
We present a new technique for semantic frame identification that leverages distributed word representations.
Introduction
Distributed Word Representations
“word representations” is mentioned in 3 sentences in this paper.