Experimental Settings | Table 2: Context-sensitive similarity scores (in bold) for the Y slots of four rule applications. |
Introduction | Rather than computing a single context-insensitive rule score, we compute a distinct word-level similarity score for each topic in an LDA model. |
Two-level Context-sensitive Inference | At learning time, we compute for each candidate rule a separate topic-biased similarity score for each topic in the LDA model.
Two-level Context-sensitive Inference | Then, at rule application time, we compute an overall reliability score for the rule by combining the per-topic similarity scores, while biasing the combination according to the given context word w.
Two-level Context-sensitive Inference | sim(v, v′)), we compute a topic-biased similarity score for each LDA topic t, denoted by simt(v, v′); simt(v, v′) is computed by applying
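The two-level scheme in these excerpts can be sketched as follows; the weighting p(t | w) by the LDA topic posterior of the context word, and all names, are illustrative assumptions rather than the paper's exact estimator.

```python
import numpy as np

def combined_similarity(per_topic_sim, topic_posterior):
    """Combine per-topic rule scores sim_t(v, v') into one context-sensitive
    score, weighting each topic t by p(t | w), the LDA topic posterior
    inferred from the context word w (illustrative weighting scheme)."""
    per_topic_sim = np.asarray(per_topic_sim, dtype=float)
    topic_posterior = np.asarray(topic_posterior, dtype=float)
    topic_posterior = topic_posterior / topic_posterior.sum()  # normalize p(t | w)
    return float(np.dot(per_topic_sim, topic_posterior))

# Example: a rule that is reliable under topic 0 but not under topic 1.
sims = [0.9, 0.1]
print(combined_similarity(sims, [0.8, 0.2]))  # context favors topic 0 -> high score
print(combined_similarity(sims, [0.1, 0.9]))  # context favors topic 1 -> low score
```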
Extending with non-wordnet data | Meyer and Gurevych (2011) showed that automatic alignments between Wiktionary senses and PWN can be established with reasonable accuracy and recall by combining multiple text similarity scores to compare a bag of words based on several pieces of information linked to a WordNet sense with another bag of words obtained from a Wiktionary entry. |
Extending with non-wordnet data | We calculated a number of similarity scores, the first two based on similarity in the number of lemmas, calculated using the Jaccard index:
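A minimal sketch of the Jaccard index over lemma sets (the example lemmas are invented):

```python
def jaccard(a, b):
    """Jaccard index between two lemma sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(a | b)

print(jaccard({"run", "walk", "go"}, {"run", "go", "move"}))  # 2/4 = 0.5
```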
Extending with non-wordnet data | This development dataset was used to tune refined similarity scores.
Incremental Topic-Based Adaptation | We define the similarity score as sim(θdi, θd*) = 1 − JSD(θdi ‖ θd*). Thus, we obtain a vector of similarity scores indexed by the training conversations.
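The JSD-based similarity can be sketched directly from the definition above; base-2 logarithms are assumed so that the divergence, and hence the score, lies in [0, 1]:

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence between two distributions (base-2 logs)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    def kl(x, y):
        mask = x > 0  # 0 * log(0) = 0 by convention
        return float(np.sum(x[mask] * np.log2(x[mask] / y[mask])))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def topic_similarity(theta_d, theta_dstar):
    """sim(θd, θd*) = 1 − JSD(θd ‖ θd*)."""
    return 1.0 - jsd(theta_d, theta_dstar)

print(topic_similarity([0.5, 0.5], [0.5, 0.5]))  # identical distributions -> 1.0
```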
Incremental Topic-Based Adaptation | For each phrase pair X → Y added to the search graph, we compute its topic similarity score as follows:
Incremental Topic-Based Adaptation | Phrase pairs from the “background conversation” only are assigned a similarity score FX→Y = 0.00.
Empirical Evaluation | Rex estimates a similarity score for each of the 1,264,827 pairings of comparable terms it finds in the Google 3-grams. |
Related Work and Ideas | Negating the log of this normalized length yields a corresponding similarity score.
Summary and Conclusions | Using the Google n-grams as a source of tacit grouping constructions, we have created a comprehensive lookup table that provides Rex similarity scores for the most common (if often implicit) comparisons. |
Summary and Conclusions | Comparability is not the same as similarity, and a nonzero similarity score does not mean that two concepts would ever be considered comparable by a human. |
Abstract | Then, for each phrase pair extracted from the training data, we create a vector with features defined in the same way, and calculate its similarity score with the vector representing the dev set. |
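A minimal sketch of scoring a phrase-pair feature vector against the dev-set vector, assuming cosine similarity over sparse feature dicts (the feature names and values are invented, and cosine is an assumption, not necessarily the paper's exact metric):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse feature vectors (dicts)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

dev_vec = {"f1": 1.0, "f2": 2.0}        # vector representing the dev set
pair_vec = {"f1": 1.0, "f2": 2.0}       # vector for one extracted phrase pair
print(cosine(dev_vec, pair_vec))        # identical vectors -> 1.0
```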
Vector space model adaptation | VSM uses the similarity score between the vec- |
Vector space model adaptation | To further improve the similarity score, we apply absolute discounting smoothing when calculating the probability distributions p(f, e).
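A sketch of absolute discounting over a toy count table; the constant d and the uniform redistribution over unseen events are illustrative assumptions, not necessarily the paper's exact scheme:

```python
def absolute_discount(counts, vocab, d=0.5):
    """Absolute-discounting estimate: subtract a constant d from every
    nonzero count and spread the freed mass uniformly over the unseen
    events in vocab."""
    total = sum(counts.values())
    seen = [w for w in vocab if counts.get(w, 0) > 0]
    unseen = [w for w in vocab if counts.get(w, 0) == 0]
    freed = d * len(seen) / total  # total probability mass removed
    probs = {}
    for w in vocab:
        c = counts.get(w, 0)
        probs[w] = (c - d) / total if c > 0 else (freed / len(unseen) if unseen else 0.0)
    return probs

p = absolute_discount({"a": 3, "b": 1}, ["a", "b", "c"], d=0.5)
print(p)  # a: 2.5/4, b: 0.5/4, and c receives the freed mass 1.0/4
```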
Using the Framework | For each pair of nodes (u, v) in the graph, we compute the semantic similarity score (using WordNet) between every pair of dependency relations (rel: a, b) in u and v as: s(u, v) = Σ WN(ai, aj) × WN(bi, bj),
Using the Framework | where WN(wi, wj) is defined as the WordNet similarity score between words wi and wj.
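The summation s(u, v) can be sketched with a pluggable word-similarity function standing in for WordNet; here a toy exact-match function keeps the example self-contained (in practice one would plug in a real WordNet measure such as path similarity):

```python
def node_similarity(rels_u, rels_v, wn_sim):
    """s(u, v) = Σ_{i,j} WN(a_i, a_j) × WN(b_i, b_j), where each node
    carries dependency relations (rel, a, b) and wn_sim(x, y) is a
    pluggable word-similarity function."""
    return sum(
        wn_sim(a_i, a_j) * wn_sim(b_i, b_j)
        for (_, a_i, b_i) in rels_u
        for (_, a_j, b_j) in rels_v
    )

# Toy stand-in for WordNet similarity: exact string match only.
toy_sim = lambda x, y: 1.0 if x == y else 0.0

u = [("dobj", "love", "tennis")]
v = [("dobj", "love", "tennis")]
print(node_similarity(u, v, toy_sim))  # 1.0 * 1.0 = 1.0
```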
Using the Framework | For example, the sentences “I adore tennis” and “Everyone likes tennis” convey the same view and should be assigned a higher similarity score than “I hate tennis”.
Learning Representation for Contextual Document | In the pre-training stage, Stacked Denoising Auto-encoders are built in an unsupervised layer-wise fashion to discover general concepts encoding d and e. In the supervised fine-tuning stage, the entire network weights are fine-tuned to optimize the similarity score sim(d, e). |
Learning Representation for Contextual Document | The similarity score of a (d, e) pair is defined as the dot product of f(d) and f(e) (Fig.
Learning Representation for Contextual Document | That is, we raise the similarity score of the true pair sim(d, e) and penalize all the rest, sim(d, ei).
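A sketch of the dot-product similarity together with a softmax-style objective that raises sim(d, e) for the true pair while penalizing the rest; the softmax form is a common choice assumed here, not necessarily the paper's exact loss:

```python
import numpy as np

def score(f_d, f_e):
    """sim(d, e) = f(d) · f(e): dot product of the two encodings."""
    return float(np.dot(f_d, f_e))

def softmax_loss(f_d, f_e_true, f_e_all):
    """Negative log-probability of the true pair under a softmax over
    all candidate encodings; minimizing it raises sim(d, e) for the
    true pair and pushes down sim(d, e_i) for the rest."""
    sims = np.array([score(f_d, f_e) for f_e in f_e_all])
    return -(score(f_d, f_e_true) - np.log(np.sum(np.exp(sims))))

d = np.array([1.0, 0.0])
e_true = np.array([1.0, 0.0])
e_neg = np.array([0.0, 1.0])
print(score(d, e_true))  # 1.0
# The true pairing incurs a lower loss than the wrong one:
print(softmax_loss(d, e_true, [e_true, e_neg]) < softmax_loss(d, e_neg, [e_true, e_neg]))
```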
Methodology 2.1 The Problem | All of these high-affinity pairs have a similarity score higher than 0.72. |
Methodology 2.1 The Problem | These two sets of similarity scores are then plotted in a scatter plot, as in Figure 4. |
Methodology 2.1 The Problem | Then, the relation matrix of a bitext is built from the sentence-level similarity scores between the rough translation and the actual translation.
Bayesian MT Decipherment via Hash Sampling | One possible strategy is to compute similarity scores s(wfi, we′) between the current source-word feature vector wfi and the feature vectors we′ ∈ Ve for all possible candidates in the target vocabulary.
Bayesian MT Decipherment via Hash Sampling | Following this, we can prune the translation candidate set by keeping only the top candidates e* according to the similarity scores.
Bayesian MT Decipherment via Hash Sampling | Computing similarity scores alone (naïvely) would incur O(|Ve| · d) time, which is prohibitively huge since we have to do this for every token in the source language corpus; this makes the complexity far worse in practice, since the dimensionality d of the feature vectors is a much higher value.
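The naïve O(|Ve| · d) scan that hash sampling is designed to avoid can be sketched as follows; cosine similarity and the toy vectors are assumptions for illustration:

```python
import numpy as np

def top_k_candidates(w_f, target_vectors, k=3):
    """Naively score a source-word vector against every target-word
    vector (cosine here), then keep the top-k candidates. This is the
    exhaustive O(|Ve| * d) baseline that hash sampling replaces."""
    scores = {}
    nf = np.linalg.norm(w_f)
    for e, w_e in target_vectors.items():
        ne = np.linalg.norm(w_e)
        scores[e] = float(np.dot(w_f, w_e) / (nf * ne)) if nf and ne else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:k]

targets = {
    "cat": np.array([1.0, 0.0, 0.1]),
    "dog": np.array([0.9, 0.1, 0.0]),
    "car": np.array([0.0, 1.0, 0.0]),
}
print(top_k_candidates(np.array([1.0, 0.0, 0.0]), targets, k=2))  # ranks cat, dog above car
```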
Preliminaries | To integrate the two similarity scores, we adopt the average as the composite function.
Preliminaries | We finally compute initial similarity scores for all pairs (e, c) where e ∈ Ve and c ∈ Vc, and build the initial similarity matrix R0.
Preliminaries | From Rn, we finally extract one-to-one matches using a simple greedy approach of three steps: (1) choose the pair with the highest similarity score; (2) remove the corresponding row and column from Rn; (3) repeat (1) and (2) as long as the matching score is not less than a threshold δ.
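The three-step greedy matching can be sketched directly; the dict-of-pairs representation of the similarity matrix and the threshold value are illustrative choices:

```python
def greedy_match(R, delta=0.5):
    """Greedy one-to-one matching from a similarity matrix R, given as a
    dict mapping (e, c) -> score: (1) take the highest-scoring pair,
    (2) drop its row and column, (3) repeat while the best remaining
    score is at least delta."""
    R = dict(R)
    matches = []
    while R:
        (e, c), s = max(R.items(), key=lambda kv: kv[1])
        if s < delta:
            break
        matches.append((e, c))
        R = {(e2, c2): v for (e2, c2), v in R.items() if e2 != e and c2 != c}
    return matches

R = {("e1", "c1"): 0.9, ("e1", "c2"): 0.8, ("e2", "c1"): 0.7, ("e2", "c2"): 0.2}
print(greedy_match(R, delta=0.5))  # [('e1', 'c1')]; the leftover (e2, c2) scores 0.2 < delta, so we stop
```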