Abstract | We present CoSimRank, a graph-theoretic similarity measure that is efficient because it can compute a single node similarity without having to compute the similarities of the entire graph. |
Abstract | Another advantage of CoSimRank is that it can be flexibly extended from basic node-node similarity to several other graph-theoretic similarity measures. |
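The single-pair computation described above can be sketched as a truncated sum of dot products of random-walk vectors. This is a minimal CoSimRank-style illustration, not the paper's implementation: the function name, the decay factor `c`, the iteration cap, and the column-normalization of the adjacency matrix are all our own assumptions.

```python
import numpy as np

def cosimrank(adj, i, j, c=0.8, iters=20):
    """Sketch of a CoSimRank-style single-pair similarity:
    s(i, j) = sum_k c^k * <p_i^(k), p_j^(k)>, where p^(k) are
    random-walk distributions started at nodes i and j.
    Only the two walk vectors are needed, never the full
    all-pairs similarity matrix."""
    n = adj.shape[0]
    # Column-normalize the adjacency matrix into a transition matrix.
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # guard against sink nodes
    P = adj / col_sums
    p_i = np.zeros(n); p_i[i] = 1.0
    p_j = np.zeros(n); p_j[j] = 1.0
    score, decay = 0.0, 1.0
    for _ in range(iters):
        score += decay * p_i.dot(p_j)
        p_i, p_j = P.dot(p_i), P.dot(p_j)
        decay *= c
    return score
```

Because the dot product is symmetric, the sketch yields a symmetric score, and a node is always at least as similar to itself as to any other node.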
Introduction | Graph-Theoretic Similarity Measure |
Related Work | Apart from SimRank, many other similarity measures have been proposed. |
Related Work | (2006) introduce a similarity measure that is also based on the idea that nodes are similar when their neighbors are, but that is designed for bipartite graphs. |
Related Work | Another important similarity measure is cosine similarity of Personalized PageRank (PPR) vectors. |
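The PPR-based measure mentioned above can be illustrated with a short power-iteration sketch: compute a Personalized PageRank vector per node, then compare the vectors with cosine similarity. The restart probability `alpha`, the iteration count, and the function names are illustrative assumptions, not taken from any cited work.

```python
import numpy as np

def personalized_pagerank(P, seed, alpha=0.85, iters=50):
    """Power iteration for a Personalized PageRank vector:
    with probability (1 - alpha) the walk restarts at the seed node.
    P is assumed to be a column-stochastic transition matrix."""
    n = P.shape[0]
    e = np.zeros(n); e[seed] = 1.0
    v = e.copy()
    for _ in range(iters):
        v = alpha * P.dot(v) + (1 - alpha) * e
    return v

def ppr_cosine(P, i, j):
    """Cosine similarity of the two nodes' PPR vectors."""
    vi = personalized_pagerank(P, i)
    vj = personalized_pagerank(P, j)
    return vi.dot(vj) / (np.linalg.norm(vi) * np.linalg.norm(vj))
```

Each PPR vector remains a probability distribution throughout the iteration, since `P` is column-stochastic and the restart vector sums to one.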
Abstract | Our approach leverages a similarity measure that enables the structural comparison of senses across lexical resources, achieving state-of-the-art performance on the task of aligning WordNet to three different collaborative resources: Wikipedia, Wiktionary and OmegaWiki. |
Conclusions | Our method leverages a novel similarity measure which enables a direct structural comparison of concepts across different lexical resources. |
Conclusions | In future work, we plan to extend our concept similarity measure across different natural languages. |
Experiments | 4.3 Similarity Measure Analysis |
Experiments | We explained in Section 2.1 that our concept similarity measure consists of two components: the definitional and the structural similarities. |
Experiments | To evaluate our structural similarity measure in comparison to the Dijkstra-WSA method, we carried out an experiment where our alignment system used only the structural similarity component, a variant of our system we refer to as SemAlignStr. |
Lexical Resource Ontologization | To do this, we apply our definitional similarity measure introduced in Section 2.1. |
Resource Alignment | Figure 1 illustrates the procedure underlying our cross-resource concept similarity measurement technique. |
Resource Alignment | The structural similarity component, in contrast, is a novel graph-based similarity measurement technique that calculates the similarity between a pair of concepts across the semantic networks of the two resources by leveraging the semantic |
Discussion | One of the most effective similarity measures is the cosine similarity, which is a normalized dot product. |
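As a concrete illustration of the "normalized dot product" characterization above, here is a minimal sketch (the function name is our own):

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity: the dot product of u and v,
    # normalized by the product of their Euclidean norms.
    return u.dot(v) / (np.linalg.norm(u) * np.linalg.norm(v))
```

Orthogonal vectors score 0, parallel vectors score 1, and the normalization makes the measure insensitive to vector length.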
Discussion | In order to appreciate the effect of these advantages, we perform an experiment that takes H to be the set of all LCs of size 1, and uses a single similarity measure. |
Our Proposal: A Latent LC Approach | where sim is some vector similarity measure. |
Our Proposal: A Latent LC Approach | We use two common similarity measures: the vector cosine metric and the BInc (Szpektor and Dagan, 2008) similarity measure. |
Our Proposal: A Latent LC Approach | To do so, we use pointwise mutual information and the conditional probabilities P(hf|h) and P(h|hf). Similar measures have often been used for the unsupervised detection of MWEs (Villavicencio et al., 2007; Fazly and Stevenson, 2006). |
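An association score of this kind can be sketched as plain pointwise mutual information estimated from raw co-occurrence counts. The count containers and the function signature below are illustrative assumptions, not the paper's setup; the sketch also assumes a nonzero joint count.

```python
import math
from collections import Counter

def pmi(pair_counts, word_counts, total, w1, w2):
    """PMI(w1, w2) = log( P(w1, w2) / (P(w1) * P(w2)) ),
    estimated from raw counts over `total` observations.
    Positive values indicate the pair co-occurs more often
    than its unigram frequencies would predict."""
    p_joint = pair_counts[(w1, w2)] / total
    p1 = word_counts[w1] / total
    p2 = word_counts[w2] / total
    return math.log(p_joint / (p1 * p2))
```

A strongly associated pair such as a multiword expression yields a high positive PMI, which is why scores of this family are popular for unsupervised MWE detection.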
Domain Adaptation | For both POS tagging and sentiment classification, we experimented with several alternative approaches for feature weighting, representation, and similarity measures using development data, which we randomly selected from the training instances from the datasets described in Section 5. |
Domain Adaptation | With respect to similarity measures, we experimented with cosine similarity and the similarity measure proposed by Lin (1998); cosine similarity performed consistently well over all the experimental settings. |
Domain Adaptation | The feature representation was held fixed during these similarity measure comparisons. |
Domain Adaptation | As an example of the distribution prediction method, in Table 3 we show the top 3 similar distributional features u in the books (source) domain, predicted for the electronics (target) domain word w = lightweight, by different similarity measures. |
Conclusion | Moreover, our approach can combine two similarity measures in a hybrid hashing scheme, which is beneficial to comprehensively modeling the document similarity. |
Document Retrieval with Hashing | Given a query document vector q, we use the cosine similarity measure to evaluate the similarity between q and a document x in a dataset: |
Document Retrieval with Hashing | Enable a hybrid hashing scheme combining two similarity measures. |
Introduction | Furthermore, we make the hashing framework applicable to combining different similarity measures in nearest-neighbor search (NNS). |
Experiments: predicting relevance in context | Figure 3: Precision and recall on relevant links with respect to a threshold on the similarity measure (Lin’s score) |
Experiments: predicting relevance in context | A straightforward parameter to include to predict the relevance of a link is of course the similarity measure itself, here Lin’s information measure. |
Experiments: predicting relevance in context | This is already a big improvement on the use of the similarity measure alone (24%). |
Introduction | A distributional thesaurus is a lexical network that lists semantic neighbours, computed from a corpus and a similarity measure between lexical items, which generally captures the similarity of contexts in which the items occur. |
Introduction | They are used as matching sequences to locate corresponding candidate entries in the KB, and then to disambiguate those candidates using similarity measures. |
Introduction | This is usually done using similarity measures (such as cosine similarity, weighted Jaccard distance, or KL divergence) that evaluate the distance between a bag of words related to a candidate annotation and the words surrounding the entity to annotate in the text. |
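Two of the measures named above can be sketched directly from their textbook definitions over weighted bag-of-words vectors; the smoothing constant and function names are our own assumptions.

```python
import numpy as np

def weighted_jaccard(u, v):
    # Weighted Jaccard similarity: sum of element-wise minima
    # divided by the sum of element-wise maxima.
    return np.minimum(u, v).sum() / np.maximum(u, v).sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two count vectors, normalized to
    # distributions; a small epsilon avoids log(0).
    # Lower values indicate more similar distributions.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

Note the asymmetry in direction: Jaccard is a similarity (higher is closer, identical vectors score 1), while KL divergence is a dissimilarity (identical distributions score 0), so the two cannot be thresholded interchangeably.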
Related Work | It proposes a disambiguation method that combines popularity-based priors, similarity measures, and coherence. |