Algorithm | where Pr(w1, w2) is the co-occurrence count, and Pr(wi) is the total number of appearances of wi in the corpus (Church and Hanks, 1990). |
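The counts above are the ingredients of the pointwise mutual information association score of Church and Hanks (1990). A minimal sketch, assuming probabilities are estimated directly from raw counts (the example counts are hypothetical):

```python
from math import log2

def pmi(count_w1w2, count_w1, count_w2, total):
    """Pointwise mutual information (Church and Hanks, 1990):
    log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated as raw counts over the corpus size."""
    p_joint = count_w1w2 / total
    p_w1 = count_w1 / total
    p_w2 = count_w2 / total
    return log2(p_joint / (p_w1 * p_w2))

# Hypothetical counts: two words co-occur 30 times in a
# 1,000,000-token corpus, appearing 500 and 400 times overall.
score = pmi(30, 500, 400, 1_000_000)
print(round(score, 2))  # → 7.23
```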
Conclusion | Our results confirm that alignment is problematic when using co-occurrence methods across languages, at least in our setting.
Introduction | While co-occurrence scores are used to compute signatures, signatures, unlike context vectors, do not contain the score values. |
Lexicon Generation Experiments | In the case of context vectors, the vector indices, or keys, are words, and their values are co-occurrence based scores. |
Lexicon Generation Experiments | The window size for co-occurrence counting, k, was 4. |
Lexicon Generation Experiments | In the three co-occurrence based methods, NAS similarity, cosine distance, and city block distance, the highest ranking translation was selected.
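The ranking step described above can be sketched over the sparse context vectors defined earlier (word keys, co-occurrence based score values). This is a minimal illustration with made-up vectors, showing cosine similarity and city-block distance; the NAS measure is omitted since its definition is not given here:

```python
from math import sqrt

def cosine_sim(u, v):
    """Cosine similarity between sparse context vectors (word -> score dicts)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def city_block_dist(u, v):
    """City-block (L1) distance over the union of keys."""
    return sum(abs(u.get(w, 0.0) - v.get(w, 0.0)) for w in set(u) | set(v))

# Hypothetical vectors: a source word and two translation candidates.
src = {"drink": 2.0, "hot": 1.0}
cand_good = {"drink": 1.5, "hot": 0.5}
cand_bad = {"road": 3.0}

# Select the highest ranking translation by cosine similarity.
best = max([cand_good, cand_bad], key=lambda c: cosine_sim(src, c))
```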
Previous Work | (2009) replaced the traditional window-based co-occurrence counting with dependency-tree based counting, while Pekar et al. |
Previous Work | (2006) predicted missing co-occurrence values based on similar words in the same language. |
Experiments | We used three knowledge sources for our experiments: WordNet 3.0; the Sep. 9, 2007 English version of Wikipedia; and the Web pages of each ambiguous name in the WePS datasets as the NE Co-occurrence Corpus.
Introduction | This model measures similarity based only on the co-occurrence statistics of terms, without considering semantic relations such as social relatedness between named entities, associative relatedness between concepts, and lexical relatedness (e.g., acronyms, synonyms) between key terms.
Related Work | (2007) used the co-occurrence statistics between named entities on the Web.
The Structural Semantic Relatedness Measure | We extract three types of semantic relations (semantic relatedness between Wikipedia concepts, lexical relatedness between WordNet concepts and social relatedness between NEs) correspondingly from three knowledge sources: Wikipedia, WordNet and NE Co-occurrence Corpus. |
The Structural Semantic Relatedness Measure | NE Co-occurrence Corpus, a corpus of documents for capturing the social relatedness between named entities. |
The Structural Semantic Relatedness Measure | According to fuzzy set theory (Baeza-Yates et al., 1999), the degree of co-occurrence of named entities in a corpus is a measure of the relatedness between them.
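One common way to turn co-occurrence degree into a relatedness score, in the spirit of the fuzzy set model's Jaccard-style term correlation, is to normalize the joint document count by the number of documents mentioning either entity. A sketch under that assumption (the counts below are hypothetical, and this particular normalization is an illustrative choice, not necessarily the paper's exact formula):

```python
def cooccurrence_relatedness(n_u, n_v, n_uv):
    """Jaccard-style relatedness between two named entities:
    documents mentioning both, over documents mentioning either.
    n_u, n_v: document counts for each entity; n_uv: joint count."""
    return n_uv / (n_u + n_v - n_uv)

# Hypothetical document counts from an NE co-occurrence corpus.
rel = cooccurrence_relatedness(40, 30, 20)  # → 0.4
```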
Experiments | The spin model approach uses word glosses and WordNet synonym, hypernym, and antonym relations, in addition to co-occurrence statistics extracted from a corpus.
Experiments | Adding co-occurrence statistics slightly improved performance, while using glosses did not help at all. |
Experiments | No glosses or co-occurrence statistics are used. |
Related Work | To get co-occurrence statistics, they submit several queries to a search engine. |
Related Work | They construct a network of words using gloss definitions, thesaurus, and co-occurrence statistics. |
Word Polarity | Another source of links between words is co-occurrence statistics from a corpus.
Word Polarity | We study the effect of using co-occurrence statistics to connect words at the end of our experiments.
The S-Space Framework | We divide the algorithms into four categories based on their structural similarity: document-based, co-occurrence, approximation, and Word Sense Induction (WSI) models.
The S-Space Framework | Co-occurrence models build the vector space using the distribution of co-occurring words in a context, which is typically defined as a region around a word or paths rooted in a parse tree. |
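A window-based version of the context construction described above can be sketched directly: count, for each word, the words appearing within a symmetric k-word region around it. All details here (tokenization, the toy sentence, k=2) are illustrative assumptions:

```python
from collections import defaultdict

def window_cooccurrences(tokens, k=4):
    """Count co-occurrences within a symmetric window of k words
    on each side, the typical region-around-a-word context."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, w in enumerate(tokens):
        for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
            if j != i:
                counts[w][tokens[j]] += 1
    return counts

toks = "the cat sat on the mat".split()
counts = window_cooccurrences(toks, k=2)
# counts["sat"] is the sparse context vector for "sat".
```

Each inner dictionary is then the raw co-occurrence vector for that word, which downstream models weight, reduce, or compare.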
The S-Space Framework | co-occurrence data rather than model it explicitly in order to achieve better scalability for larger data sets. |
Word Space Models | Later models have expanded the notion of co-occurrence but retain the premise that distributional similarity can be used to extract meaningful relationships between words. |
Word Space Models | Common approaches use a lexical distance, syntactic relation, or document co-occurrence to define the context. |
Word Space Models | Co-occurrence Models HAL (Burgess and Lund, 1997) COALS (Rohde et al., 2009)
Experiments: Ranking Paraphrases | As for the full model, we use pmi values rather than raw frequency counts as co-occurrence statistics. |
Introduction | In the standard approach, word meaning is represented by feature vectors, with large sets of context words as dimensions, and their co-occurrence frequencies as values. |
Introduction | This allows us to model the semantic interaction between the meaning of a head word and its dependent at the microlevel of relation-specific co-occurrence frequencies. |
Related Work | Figure 1: Co-occurrence graph of a small sample corpus of dependency trees.
The model | The basis for the construction of both kinds of vector representations are co-occurrence graphs. |
The model | Figure 1 shows the co-occurrence graph of a small sample corpus of dependency trees: words are represented as nodes in the graph, and possible dependency relations between them are drawn as labeled edges, with weights corresponding to the observed frequencies.
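The graph structure just described (word nodes, relation-labeled edges, frequency weights) has a direct adjacency-map representation. A small sketch with hypothetical dependency triples, not the paper's actual data:

```python
from collections import defaultdict

# Hypothetical observations: (head, dependency relation, dependent, frequency).
triples = [
    ("eat", "obj", "apple", 3),
    ("eat", "subj", "child", 2),
    ("buy", "obj", "apple", 1),
]

# Co-occurrence graph: nodes are words; each labeled, weighted edge
# records how often a dependency relation linked two words.
graph = defaultdict(dict)
for head, rel, dep, freq in triples:
    graph[head][(rel, dep)] = freq
```

The relation-specific vector for a head word is then just its outgoing edge map, keeping the microlevel distinction between, e.g., its objects and its subjects.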
The model | introduce another kind of vectors capturing information about all words that can be reached in two steps in the co-occurrence graph.
Models of Processing Difficulty | To give a concrete example, Latent Semantic Analysis (LSA, Landauer and Dumais 1997) creates a meaning representation for words by constructing a word-document co-occurrence matrix from a large collection of documents. |
Models of Processing Difficulty | Like LSA, ICD is based on word co-occurrence vectors; however, it does not employ singular value decomposition, and it constructs a word-word rather than a word-document co-occurrence matrix.
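The contrast between the two constructions can be made concrete with a toy matrix. The counts are hypothetical, and the word-word matrix built here (via shared documents) is one simple stand-in construction, not ICD's actual procedure:

```python
import numpy as np

# Toy word-document count matrix: rows are words, columns are documents.
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 2.0]])

# LSA: factor the word-document matrix with SVD, keep k latent dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
lsa_vectors = U[:, :k] * s[:k]  # k-dimensional word representations

# ICD-style alternative, per the description above: a word-word
# co-occurrence matrix, with no SVD step.
word_word = A @ A.T
```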
Models of Processing Difficulty | Importantly, composition models are not defined with a specific semantic space in mind; they could easily be adapted to LSA, simple co-occurrence vectors, or more sophisticated semantic representations (e.g., Griffiths et al.
Experimental setup | Following their description, we use a 2,000-dimensional space of syntactic co-occurrence features appropriate to the relation being predicted, weight features with the G2 transformation and compute similarity with the cosine measure. |
Results | 30 predicates were selected for each relation; each predicate was matched with three arguments from different co-occurrence bands in the BNC, e.g., naughty-girl (high frequency), naughty-dog (medium) and naughty-lunch (low). |
Three selectional preference models | Further differences are that information about predicate-argument co-occurrence is only shared within a given interaction class rather than across the whole dataset, and that the distribution Φz is not specific to the predicate p but rather to the relation r.