Index of papers in Proc. ACL 2014 that mention
  • word pairs
Fu, Ruiji and Guo, Jiang and Qin, Bing and Che, Wanxiang and Wang, Haifeng and Liu, Ting
Abstract
We identify whether a candidate word pair has a hypernym-hyponym relation by using word-embedding-based semantic projections between words and their hypernyms.
Introduction
Subsequently, we identify whether an unknown word pair has a hypernym-hyponym relation using the projections (Section 3.4).
Method
Table 1: Embedding offsets on a sample of hypernym-hyponym word pairs.
Method
Consider the well-known example v(king) − v(queen) ≈ v(man) − v(woman): it indicates that the embedding offsets indeed represent the shared semantic relation between the two word pairs.
Method
As a preliminary experiment, we compute the embedding offsets between some randomly sampled hypernym-hyponym word pairs and measure their similarities.
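A minimal sketch of such a preliminary check, using toy 4-dimensional vectors with made-up values (the paper uses embeddings trained on a large corpus, and these word choices are illustrative):

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical embeddings for two hyponym/hypernym pairs.
emb = {
    "carp": [0.9, 0.1, 0.3, 0.0],
    "fish": [0.8, 0.2, 0.9, 0.1],
    "oak":  [0.2, 0.8, 0.2, 0.1],
    "tree": [0.1, 0.9, 0.7, 0.1],
}

def offset(hypernym, hyponym):
    # Embedding offset v(hypernym) - v(hyponym).
    return [h - l for h, l in zip(emb[hypernym], emb[hyponym])]

off1 = offset("fish", "carp")
off2 = offset("tree", "oak")

# If hypernymy behaved like one shared translation vector,
# the offsets would point in similar directions (cosine near 1).
print(round(cosine(off1, off2), 3))
```

With these toy values the two offsets are nearly parallel; the paper's point is that over real data a single offset does not suffice, motivating the learned projections.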
word pairs is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Experiments
The prior tree has about 1000 word pairs (dict).
Experiments
We then remove the word pairs appearing more than 50K times or fewer than 500 times and construct a second prior tree with about 2500 word pairs (align).
Polylingual Tree-based Topic Models
Figure 1: An example of constructing a prior tree from a bilingual dictionary: word pairs with the same meaning but in different languages are concepts; we create a common parent node to group the words in a concept and connect it to the root; uncorrelated words are connected to the root directly.
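The construction described in the caption can be sketched as follows (a toy adjacency-list tree; the word pairs and variable names are illustrative, not the paper's data):

```python
# Hypothetical bilingual dictionary entries (concepts) and
# words with no translation (uncorrelated).
bilingual_dict = [("dog", "hund"), ("house", "haus")]
uncorrelated = ["aardvark", "zeitgeist"]

# Build the prior tree as a parent -> children mapping.
tree = {"root": []}
for en, de in bilingual_dict:
    # Each translation pair becomes a concept: a common parent
    # node grouping the two words, attached to the root.
    concept = f"concept:{en}/{de}"
    tree[concept] = [en, de]
    tree["root"].append(concept)

# Uncorrelated words are connected to the root directly.
tree["root"].extend(uncorrelated)

print(tree["root"])
```

The result is a two-level prior tree: concepts under the root, translation-equivalent words under their concept, and singleton words as direct children of the root.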
Polylingual Tree-based Topic Models
The word pairs define concepts for the prior tree (align).
Topic Models for Machine Translation
The phrase pair probabilities p_w(e|f) are the normalized product of the lexical probabilities of the aligned word pairs within that phrase pair (Koehn et al., 2003).
Topic Models for Machine Translation
where c_d(·) is the number of occurrences of the word pair in document d. The lexical probability conditioned on topic k is the unsmoothed probability estimate of those expected counts.
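A minimal sketch of that unsmoothed estimate, with hypothetical expected counts for aligned word pairs under one topic (names and numbers are illustrative, not from the paper):

```python
from collections import defaultdict

# Hypothetical expected counts c(e, f) of aligned word pairs
# under a single topic k.
expected_counts = {
    ("house", "maison"): 8.0,
    ("home", "maison"): 2.0,
    ("bank", "banque"): 5.0,
}

def lexical_prob(counts):
    # Unsmoothed estimate p(e | f) = c(e, f) / sum_e' c(e', f):
    # normalize the counts over all targets e for each source f.
    totals = defaultdict(float)
    for (e, f), c in counts.items():
        totals[f] += c
    return {(e, f): c / totals[f] for (e, f), c in counts.items()}

probs = lexical_prob(expected_counts)
print(probs[("house", "maison")])  # 0.8
```

Computing one such table per topic yields the topic-conditioned lexical probabilities that the excerpt describes.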
Topic Models for Machine Translation
While vanilla topic models (LDA) can only be applied to monolingual data, there are a number of topic models for parallel corpora: Zhao and Xing (2006) assume aligned word pairs share the same topics; Mimno et al.
word pairs is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Lapata, Mirella
Experimental Setup
4,435 word pairs constitute the overlap between Nelson et al.'s norms (1998) and McRae et al.'s (2005) nouns.
Experimental Setup
This resulted in 7,576 word pairs for which we obtained similarity ratings using Amazon Mechanical Turk (AMT).
Experimental Setup
Word Pairs Semantic Visual
Introduction
We performed a large-scale evaluation on a new dataset consisting of human similarity judgments for 7,576 word pairs.
Results
Table 4: Word pairs with the highest semantic and visual similarity according to the SAE model.
Results
Table 4 shows examples of word pairs with the highest semantic and visual similarity according to the SAE model.
word pairs is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Rothe, Sascha and Schütze, Hinrich
Extensions
We use a seed dictionary of 12,630 word pairs to establish node-node correspondences between the two graphs.
Extensions
As the seed dictionary contains 12,630 word pairs, only every fourth entry of the PPR vector (the German graph has 47,439 nodes) is used for similarity calculation.
Extensions
synonym extraction (68 word pairs) / lexicon extraction (1000 word pairs)
word pairs is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Phrase Pair Embedding
Word: 1G / 500K / 20 × 500K
Word Pair: 7M / (500K)² / 20 × (500K)²
Phrase Pair: 7M / (500K)⁴ / 20 × (500K)⁴
Phrase Pair Embedding
For word pair and phrase pair embedding, the numbers are calculated on IWSLT 2009 dialog training set.
Phrase Pair Embedding
But for a source-target word pair, we may only have a 7M bilingual corpus for training (taking the IWSLT data set as an example), and there are 20 × (500K)² parameters to be tuned.
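The mismatch the excerpt points at is plain arithmetic; a quick check using the counts it quotes (embedding dimension 20, 500K-word vocabulary, 7M bilingual tokens):

```python
# Counts as quoted in the excerpt.
dim = 20
vocab = 500_000
bilingual_tokens = 7_000_000  # IWSLT-scale bilingual corpus

# A word embedding table needs 20 x 500K parameters,
# but a word-pair table needs 20 x (500K)^2.
word_params = dim * vocab
word_pair_params = dim * vocab ** 2

# Parameters per available bilingual training token:
ratio = word_pair_params / bilingual_tokens
print(f"{word_pair_params:.1e} parameters, {ratio:.0f} per token")
```

With hundreds of thousands of parameters per training token, the word-pair table is hopelessly under-determined, which is why the paper works with composed embeddings rather than a direct pair table.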
word pairs is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: