Experiments | In both tasks we compare our model's word representations with several bag-of-words weighting methods and with alternative approaches to word vector induction.
Experiments | 4.1 Word Representation Learning |
Experiments | We induce word representations with our model using 25,000 movie reviews from IMDB. |
Introduction | Word representations are a critical component of many natural language processing systems. |
Our Model | To capture semantic similarities among words, we derive a probabilistic model of documents which learns word representations.
Our Model | The energy function uses a word representation matrix $R \in \mathbb{R}^{\beta \times |V|}$ where each word $w$ (represented as a one-hot vector) in the vocabulary $V$ has a $\beta$-dimensional vector representation $\phi_w = R_w$ corresponding to that word's column in $R$. The random variable $\theta$ is also a $\beta$-dimensional vector, $\theta \in \mathbb{R}^{\beta}$, which weights each of the $\beta$ dimensions of words' representation vectors.
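Our Model | As a concrete illustration, the sketch below turns these quantities into a word distribution, assuming a log-linear energy of the form $E(w; \theta) = -\theta^\top \phi_w - b_w$ with a softmax over the vocabulary; the exact energy is not reproduced above, so this form and the function name are illustrative assumptions.

```python
import numpy as np

def word_probabilities(R, b, theta):
    """Softmax word distribution from an assumed log-linear energy.

    R     : (beta, |V|) word representation matrix; column R[:, w] is phi_w
    b     : (|V|,) per-word bias vector
    theta : (beta,) vector weighting the beta dimensions of word vectors
    """
    # Assumed energy: E(w; theta) = -theta^T phi_w - b_w,
    # so the unnormalized log-probability of each word is -E.
    scores = theta @ R + b        # shape (|V|,)
    scores -= scores.max()        # stabilize the softmax numerically
    p = np.exp(scores)
    return p / p.sum()            # p(w | theta) over the vocabulary V
```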
Our Model | We introduce a Frobenius norm regularization term for the word representation matrix $R$. The word biases $b$ are not regularized, reflecting the fact that we want the biases to capture whatever overall word frequency statistics are present in the data.
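Our Model | To make the regularization concrete, here is a minimal sketch of an objective that penalizes the Frobenius norm of $R$ while leaving the biases $b$ untouched; the weight `nu` and the precomputed data-fit term `data_nll` are hypothetical placeholders, not values from the text.

```python
import numpy as np

def regularized_objective(R, b, data_nll, nu=1e-4):
    """Objective with a Frobenius-norm penalty on R only.

    R        : (beta, |V|) word representation matrix
    b        : (|V|,) word bias vector -- deliberately absent from the penalty
    data_nll : negative log-likelihood of the corpus (hypothetical placeholder)
    nu       : regularization strength (hypothetical value)
    """
    # ||R||_F^2 is the sum of squared entries of R; the biases b stay
    # unpenalized so they can absorb overall word-frequency statistics.
    frobenius_penalty = nu * np.sum(R ** 2)
    return data_nll + frobenius_penalty
```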