Approach | This is a commonly used setup in the CQA community (Wang et al., 2009).4 Thus, for a given question, all its answers are fetched from the answer collection, and an initial ranking is constructed based on the cosine similarity between their lemma vector representations and that of the question, with lemmas weighted using tfidf (Ch.
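The tfidf-weighted cosine ranking described above can be sketched as follows; this is an illustrative toy implementation over pre-lemmatized token lists, not the authors' actual code, and the function names are hypothetical:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build tfidf-weighted term vectors for a list of token lists."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))          # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(d).items()} for d in docs]

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

An initial ranking then sorts the answers by their cosine similarity to the question vector.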
Approach | The candidate answers are scored using a linear interpolation of two cosine similarity scores: one between the entire parent document and the question (to model global context), and a second between the answer candidate and the question (for local context).6 Because the number of answer candidates is typically large (e.g., equal to the number of paragraphs in the textbook), we return the top N candidates with the highest scores.
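A minimal sketch of this interpolated scoring, assuming dense list vectors and a hypothetical interpolation weight `lam` (the paper does not state its value here):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_n_candidates(q, candidates, lam=0.7, n=2):
    """candidates: list of (answer_vec, parent_doc_vec) pairs.
    Score = lam * cos(q, doc) + (1 - lam) * cos(q, answer); return top-n indices."""
    scores = [lam * cosine(q, d) + (1 - lam) * cosine(q, a) for a, d in candidates]
    return sorted(range(len(candidates)), key=scores.__getitem__, reverse=True)[:n]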
Experiments | The following hyperparameters were tuned using grid search to maximize P@1 on each development partition: (a) the segment matching thresholds that determine the minimum cosine similarity between an answer segment and a question for the segment to be labeled QSEG; and (b)
Models and Features | 6We empirically observed that this combination of scores performs better than using solely the cosine similarity between the answer and question. |
Models and Features | If the text before or after a marker, out to a given sentence range, matches the entire text of the question (with a cosine similarity score above a threshold), that argument takes the label QSEG; otherwise it is labeled OTHER.
Models and Features | The values of the discourse features are the mean of the similarity scores (e.g., cosine similarity using tfidf weighting) of the two marker arguments and the corresponding question.
Related Work | (2011) extracted 47 cue phrases such as because from a small collection of web documents, and used the cosine similarity between an answer candidate and a bag of words containing these cue phrases as a single feature in their reranking model for non-factoid why QA. |
Evaluation | To illustrate how the model described above can learn geographically-informed semantic representations of words, table 1 displays the terms with the highest cosine similarity to wicked in Kansas and Massachusetts after running our joint model on the full 1.1 billion words of Twitter data; while wicked in Kansas is close to other evaluative terms like evil and pure and religious terms like gods and spirit, in Massachusetts it is most similar to other intensifiers like super, ridiculously and insanely. |
Evaluation | Table 2 likewise presents the terms with the highest cosine similarity to city in both California and New York; while the terms most evoked by city in California include regional locations like Chinatown, Los Angeles’ South Bay and San Francisco’s East Bay, in New York the most similar terms include hamptons, upstate and borough.
Evaluation | Table 1: Terms with the highest cosine similarity to wicked in Kansas and Massachusetts. |
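Nearest-neighbor queries like those behind Tables 1 and 2 can be sketched as follows; the toy embeddings and the function name `nearest` are illustrative, not the paper's actual model output:

```python
import math

def nearest(word_vec, vocab, k=3):
    """Return the k terms whose vectors have the highest cosine
    similarity to word_vec. vocab maps term -> vector."""
    def cos(u, v):
        d = sum(a * b for a, b in zip(u, v))
        n = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return d / n if n else 0.0
    return sorted(vocab, key=lambda t: cos(word_vec, vocab[t]), reverse=True)[:k]
```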
Document Retrieval with Hashing | Given a query document vector q, we use the Cosine similarity measure to evaluate the similarity between q and a document a_i in a dataset:
Document Retrieval with Hashing | However, such a brute-force search does not scale to massive datasets since the search time complexity for each query is O(n); additionally, the computational cost spent on Cosine similarity calculation is also nontrivial.
Document Retrieval with Hashing | (1) It tries to preserve the Cosine similarity of the original data with a probabilistic guarantee (Charikar, 2002). |
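The signed-random-projection scheme of Charikar (2002) that gives this probabilistic guarantee can be sketched as below; the helper names are illustrative:

```python
import random

def random_planes(dim, bits, seed=0):
    """Gaussian random hyperplanes, one per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

def srp_hash(vec, planes):
    """Signed random projections (Charikar, 2002): the sign of each
    projection gives one bit; the probability two vectors agree on a
    bit grows with their cosine similarity."""
    bits = 0
    for p in planes:
        dot = sum(a * b for a, b in zip(vec, p))
        bits = (bits << 1) | (dot >= 0)
    return bits

def hamming(x, y):
    """Hamming distance between two integer codes."""
    return bin(x ^ y).count("1")
```

Vectors with high cosine similarity thus receive codes with small Hamming distance, which is what makes the binary-code search in the next paragraphs work.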
Experiments | We first evaluate the quality of term vectors and ITQ binary codes by conducting whole-list Cosine similarity ranking and Hamming distance ranking, respectively.
Experiments | For each query document, the top-K candidate documents with the highest Cosine similarity scores and smallest Hamming distances are returned; we then calculate the average precision for each K. Fig.
Introduction | for example, one can store 250 million documents in 1.9G of memory using only 64 bits for each document, while a large news corpus such as the English Gigaword fifth edition1 stores 10 million documents on a 26G hard drive; 2) the time efficiency of manipulating binary codes, for example, computing the Hamming distance between a pair of binary codes is several orders of magnitude faster than computing the real-valued cosine similarity over a pair of document vectors.
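The memory figure can be checked directly: 64 bits per document over 250 million documents comes to roughly 1.9 GiB:

```python
docs = 250_000_000
bytes_needed = docs * 64 // 8   # one 64-bit code per document
gib = bytes_needed / 2**30
# roughly 1.86 GiB, in line with the "1.9G" figure quoted above
```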
CoSimRank | This is similar to cosine similarity except that the 1-norm is used instead of the 2-norm.
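A sketch of such a 1-norm variant, as the sentence describes it (an illustrative function, not the paper's exact formulation):

```python
def one_norm_sim(u, v):
    """Cosine-like similarity, but with 1-norms in the denominator
    instead of 2-norms."""
    dot = sum(a * b for a, b in zip(u, v))
    n1 = sum(abs(a) for a in u)
    n2 = sum(abs(b) for b in v)
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

Note that unlike standard cosine similarity, this measure does not equal 1 for identical vectors in general (e.g., it gives 0.5 for two copies of [0.5, 0.5]).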
Extensions | We are not including this method in our experiments, but we will give the equation here, as traditional document similarity measures (e.g., cosine similarity) perform poorly on this task, although there are also known alternatives with good results (Sahami and Heilman, 2006).
Extensions | To calculate PPR+cos, we computed 20 iterations with a decay factor of 0.8 and used the cosine similarity with the 2-norm in the denominator to compare two vectors. |
Extensions | We compute 20 iterations of PPR+cos to reach convergence and then calculate a single cosine similarity.
Related Work | Another important similarity measure is the cosine similarity of Personalized PageRank (PPR) vectors.
Related Work | These approaches use at least one of cosine similarity, PageRank and SimRank.
Solution Graph | cos: The cosine similarity between the named entity textual mention and the KB entry title. |
Solution Graph | ijim: While the cosine similarity between a textual mention in the document and the candidate |
Solution Graph | The cosine similarity between “Essex” and “Danbury, Essex” is higher than that between “Essex” and “Essex County Cricket Club”, which is not helpful in the NED setting. |
Experimental Setup | 1) Cosine-1st: we rank the utterances in the chat log based on the cosine similarity between the utterance and query.
Experimental Setup | 2) Cosine-all: we rank the utterances in the chat log based on the cosine similarity between the utterance and query and then select the utterances with a cosine similarity greater than 0; |
Experimental Setup | Query Relevance: another interesting observation is that relying only on the cosine similarity (i.e., cosine-all) to measure the query relevance presents quite a strong baseline.
Phrasal Query Abstraction Framework | We use the K-means clustering algorithm with cosine similarity as the distance function between sentence vectors composed of tfidf scores.
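K-means with cosine similarity amounts to spherical K-means: normalize the vectors and assign each to the centroid with the highest cosine. A toy sketch under those assumptions (not the paper's implementation):

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v

def cosine_kmeans(vecs, k, iters=20, seed=0):
    """Toy spherical K-means: cosine similarity on unit vectors."""
    random.seed(seed)
    vecs = [normalize(v) for v in vecs]
    cents = random.sample(vecs, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vecs:
            best = max(range(k), key=lambda j: sum(a * b for a, b in zip(v, cents[j])))
            clusters[best].append(v)
        # recenter: mean direction of each cluster (keep old centroid if empty)
        cents = [normalize([sum(xs) for xs in zip(*c)]) if c else cents[j]
                 for j, c in enumerate(clusters)]
    return [max(range(k), key=lambda j: sum(a * b for a, b in zip(v, cents[j])))
            for v in vecs]
```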
Discussion | One of the most effective similarity measures is the cosine similarity, which is a normalized dot product.
Discussion | Indeed, taking Hp as above, and cosine similarity as the only feature (i.e., w ∈ R), yields the distribution
Our Proposal: A Latent LC Approach | These measures give complementary perspectives on the similarity between the predicates, as the cosine similarity is symmetric between the LHS and RHS predicates, while BInc takes into account the directionality of the inference relation. |
Models for Measuring Grammatical Competence | The similarity between a test response and a score-specific vector is then calculated by a cosine similarity metric. |
Models for Measuring Grammatical Competence | Although a total of 4 cosine similarity scores (one per score group) were generated, only cos4 from among the four similarity scores, and cosmax,
Models for Measuring Grammatical Competence | • cos4: the cosine similarity score between the test response and the vector of POS bigrams for the highest score class (level 4); and,
Machine Learning with Edit-Turn-Pairs | We used the cosine similarity, longest common subsequence, and word n-gram similarity measures.
Machine Learning with Edit-Turn-Pairs | Cosine similarity was applied on binary weighted term vectors (L2 norm). |
Machine Learning with Edit-Turn-Pairs | Cosine similarity, longest common subsequence, and word n-gram similarity were also applied to measure the similarity between the edit comment and the turn text, as well as the similarity between the edit comment and the turn topic name.
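Cosine over binary weighted term vectors with the L2 norm, as used above, reduces to set overlap; a minimal sketch (the function name is illustrative):

```python
import math

def binary_cosine(a, b):
    """Cosine over binary (set-membership) term vectors with L2
    normalization: |A ∩ B| / sqrt(|A| * |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))
```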