Index of papers in Proc. ACL 2012 that mention
  • embeddings
Improving Word Representations via Global Context and Multiple Word Prototypes
Eric Huang, Richard Socher, Christopher Manning, Andrew Ng
Abstract
We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) accounts for homonymy and polysemy by learning multiple embeddings per word.
Experiments
In this section, we first present a qualitative analysis comparing the nearest neighbors of our model’s embeddings with those of others, showing our embeddings better capture the semantics of words, with the use of global context.
For all experiments, our models use 50-dimensional embeddings.
The nearest neighbors of “market” given by C&W’s embeddings are dominated by a syntactic constraint: words in plural form are close only to other plural forms, whereas our model captures that the singular and plural forms of a word are similar in meaning.
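In practice, such a nearest-neighbor comparison reduces to a cosine-similarity lookup in embedding space. Below is a minimal sketch, assuming a hypothetical word list `vocab` and embedding matrix `E` of shape (len(vocab), 50), matching the 50-dimensional setting above; the names are ours, not the paper's:

```python
import numpy as np

def nearest_neighbors(word, vocab, E, k=5):
    """Return the k vocabulary words whose embeddings have the highest
    cosine similarity to `word`'s embedding."""
    idx = vocab.index(word)
    # Normalize rows so that dot products equal cosine similarities.
    E_hat = E / np.clip(np.linalg.norm(E, axis=1, keepdims=True), 1e-8, None)
    sims = E_hat @ E_hat[idx]
    order = np.argsort(-sims)            # most similar first
    return [vocab[i] for i in order if i != idx][:k]
```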
Global Context-Aware Neural Language Model
$C_{s,d} = \sum_{w \in V} \max\left(0,\; 1 - g(s, d) + g(s^w, d)\right) \quad (1)$

where $s^w$ is the sequence $s$ with its last word replaced by word $w$. Collobert and Weston (2008) showed that this ranking approach can produce good word embeddings that are useful in several NLP tasks, and allows much faster training of the model compared to optimizing the log-likelihood of the next word.
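As a rough illustration, the cost in Eq. (1) can be written directly from its definition. The scoring function `g` and the word list `vocab` below are placeholders; following Collobert and Weston (2008), training would in practice sample corrupted words rather than sum over the entire vocabulary:

```python
# A minimal sketch of the ranking cost in Eq. (1), assuming a
# hypothetical scoring function g(sequence, document) -> float and a
# word list `vocab`. The hinge loss pushes the observed sequence s to
# score at least a margin of 1 above each corrupted sequence s^w,
# obtained by replacing the last word of s with w.
def ranking_cost(s, d, g, vocab):
    cost = 0.0
    for w in vocab:
        s_w = s[:-1] + [w]               # corrupt the last word of s
        cost += max(0.0, 1.0 - g(s, d) + g(s_w, d))
    return cost
```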
where $[x_1, x_2, \ldots, x_m]$ is the concatenation of the $m$ word embeddings representing sequence $s$, $f$ is an element-wise activation function such as $\tanh$, $a_1 \in \mathbb{R}^{h \times 1}$ is the activation of the hidden layer with $h$ hidden nodes, $W_1 \in \mathbb{R}^{h \times (mn)}$ and $W_2 \in \mathbb{R}^{1 \times h}$ are respectively the first and second layer weights of the neural network, and $b_1, b_2$ are the biases of each layer.
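Read as code, this is a standard two-layer network applied to the concatenated window. A minimal sketch, assuming parameter arrays with the shapes stated above:

```python
# W1 has shape (h, m*n), b1 shape (h,), W2 shape (1, h), b2 shape (1,),
# for window size m, embedding dimension n, and h hidden nodes.
import numpy as np

def score(x_list, W1, b1, W2, b2):
    """x_list: the m word embeddings of the sequence, each of shape (n,)."""
    x = np.concatenate(x_list)           # [x1; x2; ...; xm], shape (m*n,)
    a1 = np.tanh(W1 @ x + b1)            # hidden activation, shape (h,)
    return (W2 @ a1 + b2).item()         # scalar score of the sequence
```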
For the score of the global context, we represent the document also as an ordered list of word embeddings, $d = (d_1, d_2, \ldots, d_k)$.
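The paper then collapses this list into a single context vector by taking a weighted average of the document's word embeddings (it reports using idf weighting). A minimal sketch, with the per-word weights passed in as a hypothetical `weights` argument:

```python
# Collapse the document d = (d_1, ..., d_k) into one vector via a
# weighted average of its word embeddings; the weights (e.g. idf
# values) are supplied by the caller.
import numpy as np

def document_vector(d_embeddings, weights):
    D = np.stack(d_embeddings)            # shape (k, n)
    w = np.asarray(weights, dtype=float)  # shape (k,)
    return (w[:, None] * D).sum(axis=0) / w.sum()
```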
Multi-Prototype Neural Language Model
We present a way to use our learned single-prototype embeddings to represent each context window, which can then be clustered to perform word sense discrimination (Schütze, 1998).
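A minimal sketch of that pipeline: each occurrence's context window is averaged into a fixed-length vector (the paper additionally idf-weights this average), and the vectors are clustered to induce senses. Plain k-means on length-normalized vectors stands in here for the paper's spherical k-means:

```python
# `contexts` is a hypothetical list of context windows, each a list of
# single-prototype embedding vectors; n_senses is the number of
# prototypes to learn per word.
import numpy as np
from sklearn.cluster import KMeans

def discriminate_senses(contexts, n_senses):
    C = np.stack([np.mean(win, axis=0) for win in contexts])
    # Unit-normalize so Euclidean k-means approximates spherical k-means.
    C /= np.clip(np.linalg.norm(C, axis=1, keepdims=True), 1e-8, None)
    labels = KMeans(n_clusters=n_senses, n_init=10).fit_predict(C)
    return labels                        # cluster id = induced sense per occurrence
```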
“embeddings” is mentioned in 19 sentences in this paper.