Abstract | We build a semantic similarity graph to encode lexical semantic clues, and employ a convolutional neural model to capture contextual semantic clues.
Introduction | Then, based on the assumption that terms that are more semantically similar to the seeds are more likely to be product features, a graph which measures semantic similarities between terms is built to capture lexical semantic clues.
Introduction | It exploits semantic similarity between words to capture lexical clues, which is shown to be more effective than co-occurrence relations between words and syntactic patterns.
Introduction | In addition, experiments show that the semantic similarity has the advantage of mining infrequent product features, which is crucial for this task. |
The Proposed Method | Then, a semantic similarity graph is created to capture lexical semantic clues, and a Convolutional Neural Network (CNN) (Collobert et al., 2011) is trained in each bootstrapping iteration to encode contextual semantic clues.
The Proposed Method | 3.2 Capturing Lexical Semantic Clues in a Semantic Similarity Graph
The Proposed Method | 3.2.2 Building the Semantic Similarity Graph |
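The graph construction described in these snippets can be sketched as follows. This is a minimal illustration, not the paper's actual method: the toy embeddings, the cosine measure, and the threshold are all hypothetical choices standing in for whatever similarity measure the authors use.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_graph(embeddings, threshold=0.5):
    """Build an undirected graph linking terms whose embedding
    cosine similarity exceeds a threshold -- a toy stand-in for
    the semantic similarity graph over candidate terms."""
    terms = list(embeddings)
    edges = {}
    for i, t in enumerate(terms):
        for u in terms[i + 1:]:
            sim = cosine(embeddings[t], embeddings[u])
            if sim >= threshold:
                edges[(t, u)] = sim
    return edges

# Hypothetical 2-d embeddings: "screen" and "display" point the
# same way, so only they end up connected.
emb = {"screen": [1.0, 0.1], "display": [0.9, 0.2], "pizza": [-0.1, 1.0]}
graph = similarity_graph(emb, threshold=0.8)
```

In a bootstrapping setup as described above, seed terms would then propagate labels along these edges, so that terms close to the seeds in the graph are ranked as likely product features.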
Abstract | We evaluate our proposed method on two end-to-end SMT tasks (phrase table pruning and decoding with phrasal semantic similarities) which need to measure semantic similarity between a source phrase and its translation candidates. |
Experiments | With the semantic phrase embeddings and the vector space transformation function, we apply the BRAE to measure the semantic similarity between a source phrase and its translation candidates in the phrase-based SMT. |
Experiments | Two tasks are involved in the experiments: phrase table pruning, which discards entries whose semantic similarity is very low, and decoding with the phrasal semantic similarities as additional features.
Experiments | To avoid the situation in which all the translation candidates for a source phrase are pruned, we always keep the 10 best according to the semantic similarity.
Introduction | With the learned model, we can accurately measure the semantic similarity between a source phrase and a translation candidate. |
Introduction | Accordingly, we evaluate the BRAE model on two end-to-end SMT tasks (phrase table pruning and decoding with phrasal semantic similarities) which need to check whether a translation candidate and the source phrase have the same meaning.
Introduction | In phrase table pruning, we discard the phrasal translation rules with low semantic similarity.
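The pruning rule in these snippets — drop low-similarity entries but always keep the 10 best per source phrase — can be sketched directly. The data layout and threshold below are hypothetical; only the keep-the-n-best safeguard comes from the text.

```python
def prune_phrase_table(table, threshold=0.3, keep_best=10):
    """Discard translation candidates whose similarity score is below
    `threshold`, but always keep the `keep_best` highest-scoring
    candidates per source phrase, so no source phrase loses all of
    its translations (the safeguard described in the paper)."""
    pruned = {}
    for src, cands in table.items():
        ranked = sorted(cands, key=lambda c: c[1], reverse=True)
        pruned[src] = [c for i, c in enumerate(ranked)
                       if i < keep_best or c[1] >= threshold]
    return pruned

# Toy phrase table: (translation, semantic similarity score).
table = {"maison": [("house", 0.9), ("home", 0.7), ("barn", 0.1)]}
pruned = prune_phrase_table(table, threshold=0.3, keep_best=2)
```

With `keep_best=2`, "barn" falls below the threshold and is outside the protected top-2, so it is the only entry discarded.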
Abstract | By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data. |
Application of the vector-space model | One of the advantages conferred by the quantification of semantic similarity is that lexical items can be precisely considered in relation to each other. By aggregating the similarity information for all items in the distribution, we can produce a visual representation of the structure of the semantic domain of the construction, in order to observe how verbs in that domain are related to each other and to immediately identify the regions of the semantic space that are densely populated (with tight clusters of verbs) and those that are more sparsely populated (fewer and/or more scattered verbs).
Application of the vector-space model | With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space. |
Conclusion | This paper reports the first attempt at using a distributional measure of semantic similarity derived from a vector-space model for the study of syntactic productivity in diachrony. |
Conclusion | Not only does distributional semantics provide an empirically-based measure of semantic similarity that appropriately captures semantic distinctions, it also enables the use of methods for which quantification is necessary, such as data visualization and statistical analysis. |
Distributional measure of semantic similarity | One benefit of the distributional semantics approach is that it allows semantic similarity between words to be quantified by measuring the similarity in their distribution. |
Distributional measure of semantic similarity | According to Sahlgren (2008), this kind of model captures to what extent words can be substituted for each other, which is a good measure of semantic similarity between verbs. |
Distributional measure of semantic similarity | In order to make sure that enough distributional information is available to reliably assess semantic similarity, verbs with fewer than 2,000 occurrences were excluded, which left 92 usable items (out of 105).
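The distributional approach these snippets describe — quantifying similarity between words by comparing the contexts they occur in — can be sketched at its simplest with raw co-occurrence counts. The corpus, window size, and cosine measure below are illustrative assumptions, not the study's actual configuration (which would use a large corpus and typically weighted vectors).

```python
from collections import Counter
import math

def context_vector(target, corpus, window=2):
    """Count words co-occurring with `target` within +/- `window`
    tokens, over a tokenized corpus (a list of token lists)."""
    counts = Counter()
    for sent in corpus:
        for i, tok in enumerate(sent):
            if tok == target:
                for j in range(max(0, i - window), i + window + 1):
                    if j != i and j < len(sent):
                        counts[sent[j]] += 1
    return counts

def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    keys = set(c1) | set(c2)
    dot = sum(c1[k] * c2[k] for k in keys)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Tiny toy corpus: "cat" and "dog" share contexts, "fish" does not.
corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the mouse".split(),
    "the cat ate fish".split(),
]
v_cat = context_vector("cat", corpus)
v_dog = context_vector("dog", corpus)
v_fish = context_vector("fish", corpus)
```

Because "cat" and "dog" appear in near-identical contexts here, their vectors are far more similar than those of "cat" and "fish" — the substitutability intuition attributed to Sahlgren (2008) above.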
Introduction | In this paper, I present a third alternative that takes advantage of advances in computational linguistics and draws on a distributionally-based measure of semantic similarity.
The hell-construction | To answer these questions, I will analyze the distribution of the construction from a semantic point of view by using a measure of semantic similarity derived from distributional information. |
Experimental Setup | Next, for each word we randomly selected 30 pairs under the assumption that they are representative of the full variation of semantic similarity . |
Experimental Setup | Participants were asked to rate a pair on two dimensions, visual and semantic similarity, using a Likert scale of 1 (highly dissimilar) to 5 (highly similar).
Experimental Setup | For semantic similarity, the mean correlation was 0.76 (Min −0.34, Max
Results | We would expect the textual modality to be more dominant when modeling semantic similarity and conversely the perceptual modality to be stronger with respect to visual similarity. |
Results | The textual SAE correlates better with semantic similarity judgments (ρ = 0.65) than its visual equivalent (ρ = 0.60).
Results | It yields a correlation coefficient of ρ = 0.70 on semantic similarity and ρ = 0.64 on visual similarity.
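Evaluations like the ones above typically report Spearman's rank correlation between model similarity scores and human Likert ratings. A minimal sketch of that computation (the rating and score values below are invented; the tie-handling via average ranks is standard):

```python
def ranks(xs):
    """1-based average ranks, handling ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean rank of the tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

human = [5, 4, 1, 2]          # hypothetical Likert ratings
model = [0.9, 0.7, 0.1, 0.3]  # hypothetical model similarity scores
rho = spearman(human, model)
```

Here the model orders the pairs exactly as the humans do, so ρ = 1.0 even though the raw scales differ — which is why rank correlation is the usual choice for this comparison.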
Abstract | judging the semantic similarity of natural-language sentences), and show that PSL gives improved results compared to a previous approach based on Markov Logic Networks (MLNs) and a purely distributional approach. |
Background | Distributional models (Turney and Pantel, 2010), on the other hand, use statistics on contextual data from large corpora to predict semantic similarity of words and phrases (Landauer and Dumais, 1997; Mitchell and Lapata, 2010). |
Background | Distributional models are motivated by the observation that semantically similar words occur in similar contexts, so words can be represented as vectors in high dimensional spaces generated from the contexts in which they occur (Landauer and Dumais, 1997; Lund and Burgess, 1996). |
Background | (2013) use MLNs to represent the meaning of natural language sentences and judge textual entailment and semantic similarity, but they were unable to scale the approach beyond short sentences due to the complexity of MLN inference.
Evaluation | More specifically, they strongly indicate that PSL is a more effective probabilistic logic for judging semantic similarity than MLNs. |
Approach | We exploit this semantic similarity across languages by defining a bilingual (and trivially multilingual) energy as follows. |
Experiments | Even though the model did not use any parallel French-German data during training, it learns semantic similarity between these two languages using English as a pivot, and semantically clusters words across all languages. |
Related Work | Very simple composition functions have been shown to suffice for tasks such as judging bi-gram semantic similarity (Mitchell and Lapata, 2008). |
Related Work | Their architecture optimises the cosine similarity of documents, using relative semantic similarity scores during learning.
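The "very simple composition functions" attributed to Mitchell and Lapata (2008) above include plain vector addition: a bi-gram's vector is the sum of its word vectors, and bi-gram similarity is then the cosine between composed vectors. A sketch with invented toy vectors:

```python
import math

def compose_add(u, v):
    """Additive composition: bi-gram vector = sum of word vectors,
    one of the simple functions of Mitchell and Lapata (2008)."""
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Hypothetical 2-d word vectors.
old, used, ate = [1.0, 0.2], [0.9, 0.3], [-1.0, 0.1]
car = [0.1, 1.0]

# "old car" should be closer to "used car" than to "ate car".
sim_near = cosine(compose_add(old, car), compose_add(used, car))
sim_far = cosine(compose_add(old, car), compose_add(ate, car))
```

Despite its simplicity, addition preserves enough of each constituent's direction for such bi-gram similarity judgments, which is the finding the snippet reports.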
Introduction | Instead of procuring explicit representations, the kernel paradigm directly focuses on the larger goal of quantifying semantic similarity of larger linguistic units. |
Introduction | Figure 1: Tokenwise syntactic and semantic similarities don’t imply sentential semantic similarity
Introduction | With such neighbourhood contexts, the distributional paradigm posits that semantic similarity between a pair of motifs can be given by a sense of ‘distance’ between the two distributions. |
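One standard way to realise the "sense of 'distance' between the two distributions" mentioned above is Jensen–Shannon divergence over the motifs' context distributions. This is a generic sketch, not necessarily the divergence the paper itself adopts; the distributions below are invented.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence (base 2), skipping zero terms."""
    return sum(pi * math.log2(pi / qi)
               for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Symmetric, bounded divergence between two probability
    distributions over the same context vocabulary:
    0 = identical, 1 = disjoint support (with log base 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical context distributions over a 3-word vocabulary.
p = [0.5, 0.5, 0.0]  # motif A's neighbourhood contexts
q = [0.5, 0.5, 0.0]  # motif B: identical contexts
r = [0.0, 0.0, 1.0]  # motif C: disjoint contexts
```

Identical distributions score 0 and fully disjoint ones score 1, so (1 − JSD) can serve directly as a similarity between motifs.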
Abstract | In a quantitative evaluation on the task of judging geographically informed semantic similarity between representations learned from 1.1 billion words of geo-located tweets, our joint model outperforms comparable independent models that learn meaning in isolation. |
Evaluation | We evaluate our model by confirming its face validity in a qualitative analysis and estimating its accuracy at the quantitative task of judging geographically-informed semantic similarity.
Evaluation | As a quantitative measure of our model’s performance, we consider the task of judging semantic similarity among words whose meanings are likely to evoke strong geographical correlations. |
Abstract | We experimentally demonstrate that the discourse structure of non-factoid answers provides information that is complementary to lexical semantic similarity between question and answer, improving performance up to 24% (relative) over a state-of-the-art model that exploits lexical semantic similarity alone. |
Results | This way, the DMM and DPM features jointly capture discourse structures and semantic similarity between answer segments and the question.
Results | Empirically, we show that modeling answer discourse structures is complementary to modeling lexical semantic similarity and that the best performance is obtained when they are tightly integrated.
Model | Turkers were required to type in their best guess, and the number of semantically similar guesses was counted by an average of 6 other turkers.
Model | A ratio of the median of semantically similar guesses to the total number of guesses was then taken as the score representing “predictability” of the word being guessed in the given context. |
Model | Turkers that judged the semantic similarity of the guesses of other turkers achieved an average Cohen’s kappa agreement of 0.44, indicating fair to poor agreement. |
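The Cohen's kappa agreement reported above corrects raw agreement between two annotators for agreement expected by chance. A minimal sketch (the two label sequences are invented; the formula κ = (p_o − p_e)/(1 − p_e) is the standard one):

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two annotators labelling the same items:
    observed agreement corrected for chance agreement."""
    assert len(a) == len(b)
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    labels = set(a) | set(b)
    # Chance agreement from each annotator's marginal label frequencies.
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical similarity judgments from two turkers.
ann_a = ["sim", "sim", "dis", "dis"]
ann_b = ["sim", "dis", "dis", "dis"]
kappa = cohens_kappa(ann_a, ann_b)
```

Here the annotators agree on 3 of 4 items (p_o = 0.75) but chance alone predicts 0.5, giving κ = 0.5 — illustrating why a κ of 0.44, as in the snippet, signals only modest agreement despite fairly high raw agreement.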
Abstract | We first set up a human annotation of semantic links with or without contextual information to show the importance of the textual context in evaluating the relevance of semantic similarity, and to assess the prevalence of actual semantic relations between word tokens.
Conclusion | We proposed a method to reliably evaluate distributional semantic similarity in a broad sense by considering the validation of lexical pairs in contexts where they both appear. |
Introduction | We hypothesize that evaluating and filtering semantic relations in texts where lexical items occur would help tasks that naturally make use of semantic similarity relations, but assessing this goes beyond the present work.
Introduction | Deeper approaches leverage semantic similarity to go beyond the surface realization of definitions (Navigli, 2006; Meyer and Gurevych, 2011; Niemann and Gurevych, 2011). |
Resource Alignment | These two scores are then combined into an overall score (part (e) of Figure 1) which quantifies the semantic similarity of the two input concepts o1 and o2.
Resource Alignment | PPR has been previously used in a wide variety of tasks such as definition similarity-based resource alignment (Niemann and Gurevych, 2011), textual semantic similarity (Hughes and Ramage, 2007; Pilehvar et al., 2013), Word Sense Disambiguation (Agirre and Soroa, 2009; Faralli and Navigli, 2012) and semantic text categorization (Navigli et al., 2011). |
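The Personalized PageRank (PPR) these snippets refer to is a random walk over a semantic graph that restarts at a seed node; the resulting stationary distribution serves as that node's semantic signature, and two concepts can be compared via their signatures. A minimal power-iteration sketch on an invented toy graph (a real application would run over a lexical resource such as WordNet):

```python
def personalized_pagerank(graph, seed, alpha=0.85, iters=50):
    """Power iteration for Personalized PageRank on an adjacency
    dict; with probability (1 - alpha) the walk restarts at `seed`,
    which biases the stationary distribution toward its neighbourhood."""
    nodes = list(graph)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - alpha) * (n == seed) for n in nodes}
        for n in nodes:
            out = graph[n]
            if not out:
                continue
            share = alpha * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank

# Hypothetical semantic graph over four concepts.
graph = {
    "cat": ["feline", "pet"],
    "feline": ["cat"],
    "pet": ["cat", "dog"],
    "dog": ["pet"],
}
ppr = personalized_pagerank(graph, seed="cat")
```

Seeding at "cat" concentrates probability mass on its direct neighbour "feline" rather than the more distant "dog", which is exactly the locality that makes PPR signatures useful for the alignment and disambiguation tasks listed above.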