Abstract | We present equivalent formalizations that show CoSimRank’s close relationship to Personalized PageRank and SimRank and also show how we can take advantage of fast matrix multiplication algorithms to compute CoSimRank. |
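The matrix formulation mentioned in the abstract can be sketched as follows. This is an illustrative implementation only, not the paper's code: it assumes a column-stochastic transition matrix A, a decay factor c = 0.8, and truncation of the CoSimRank series after a fixed number of iterations, accumulating c^k (A^k)^T (A^k).

```python
import numpy as np

def cosimrank(adj, c=0.8, iters=20):
    """Truncated CoSimRank: S = sum_k c^k (A^k)^T (A^k), with A column-stochastic."""
    col_sums = adj.sum(axis=0, keepdims=True)
    A = adj / np.where(col_sums == 0, 1, col_sums)  # column-normalize the adjacency
    M = np.eye(adj.shape[0])                        # M holds A^k, starting at A^0 = I
    S = M.copy()                                    # k = 0 term of the series
    for k in range(1, iters + 1):
        M = A @ M                                   # advance to A^k
        S += (c ** k) * (M.T @ M)                   # column i of A^k is node i's walk distribution
    return S
```

Because each term is a product of a matrix with its transpose, the whole computation reduces to dense matrix multiplications, which is where fast matrix multiplication routines can be exploited.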
CoSimRank | We first give an intuitive introduction to CoSimRank as a Personalized PageRank (PPR) derivative.
CoSimRank | 3.1 Personalized PageRank |
CoSimRank | Haveliwala (2002) introduced Personalized PageRank, or topic-sensitive PageRank, based on the idea that the uniform damping vector p(0) can be replaced by a personalized vector, which depends on node i.
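The idea above can be sketched with a short power iteration. This is a minimal illustration under assumed conventions (column-stochastic transitions, damping factor d = 0.85, a fixed iteration count): the uniform damping vector is replaced by the indicator vector e_i of node i.

```python
import numpy as np

def personalized_pagerank(adj, i, d=0.85, iters=100):
    """PPR vector for node i: teleport to e_i instead of the uniform vector."""
    col_sums = adj.sum(axis=0, keepdims=True)
    P = adj / np.where(col_sums == 0, 1, col_sums)  # column-stochastic transitions
    n = adj.shape[0]
    v = np.zeros(n)
    v[i] = 1.0                                      # personalization vector e_i
    p = v.copy()
    for _ in range(iters):
        p = d * P @ p + (1 - d) * v                 # random walk with restarts to node i
    return p
```

The resulting vector concentrates probability mass around node i, which is what makes cosine similarity of two such vectors a usable node-similarity measure.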
Extensions | The use of weighted edges was first proposed in the PageRank patent. |
Introduction | These algorithms are often based on PageRank (Brin and Page, 1998) and other centrality measures (e.g., (Erkan and Radev, 2004)).
Introduction | This paper introduces CoSimRank, a new graph-theoretic algorithm for computing node similarity that combines features of SimRank and PageRank.
Related Work | Another important similarity measure is cosine similarity of Personalized PageRank (PPR) vectors. |
Related Work | LexRank (Erkan and Radev, 2004) is similar to PPR+cos in that it combines PageRank and cosine; it initializes the sentence similarity matrix of a document using cosine and then applies PageRank to compute lexical centrality. |
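The LexRank combination of cosine and PageRank described above can be sketched as follows. The function name, the similarity threshold of 0.1, and the use of raw sentence vectors are illustrative assumptions, not details taken from Erkan and Radev (2004).

```python
import numpy as np

def lexrank(sent_vecs, threshold=0.1, d=0.85, iters=100):
    """Rank sentences by PageRank over a thresholded cosine-similarity graph."""
    norms = np.linalg.norm(sent_vecs, axis=1, keepdims=True)
    unit = sent_vecs / np.where(norms == 0, 1, norms)
    sim = unit @ unit.T                           # pairwise cosine similarity
    adj = (sim > threshold).astype(float)         # keep only sufficiently similar pairs
    np.fill_diagonal(adj, 0)                      # no self-loops
    row_sums = adj.sum(axis=1, keepdims=True)
    P = adj / np.where(row_sums == 0, 1, row_sums)  # row-stochastic transitions
    n = len(sent_vecs)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = d * P.T @ r + (1 - d) / n             # PageRank over the similarity graph
    return r
```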
Related Work | These approaches use at least one of cosine similarity, PageRank and SimRank. |
Baselines | In addition, the PageRank algorithm (Page et al., 1998) is adopted to optimize the graph model.
Baselines | Finally, the weights of word nodes are calculated using the PageRank algorithm as follows: |
Baselines | where d is the damping factor as in the PageRank algorithm. |
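The exact weighted formula from that baseline is not reproduced in the excerpt, so the following is only a sketch of the standard per-node PageRank update with damping factor d that such word-graph models instantiate; uniform out-degree weights are an assumption here.

```python
def pagerank_scores(out_links, d=0.85, iters=50):
    """Standard PageRank over a dict graph {node: [linked nodes]}."""
    nodes = list(out_links)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    # Precompute, for each node, the set of nodes linking to it.
    in_links = {v: [u for u in nodes if v in out_links[u]] for v in nodes}
    for _ in range(iters):
        new = {}
        for v in nodes:
            # Damped incoming mass plus the (1 - d)/n teleport term.
            rank_sum = sum(score[u] / len(out_links[u]) for u in in_links[v])
            new[v] = (1 - d) / n + d * rank_sum
        score = new
    return score
```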
Introduction | In addition, the standard PageRank algorithm is employed to optimize the graph model.
Algorithms | After creating the graph, PageRank is run to rank sentences. |
Algorithms | Finally, instead of PageRank, we used SimRank (Jeh and Widom, 2002) to identify the nodes most similar to the query node, rather than only the central sentences in the graph.
Previous Work | After the graph is generated, the PageRank algorithm (Page et al., 1999) is used to determine the most central linguistic units in the graph. |
Previous Work | PageRank spreads the query similarity of a vertex to its close neighbors, so that sentences similar to other sentences that are themselves similar to the query are ranked higher.
Evaluation | We set the damping factor μ to 0.85, following the standard PageRank paradigm.
Problem Formulation | The standard PageRank algorithm starts from an arbitrary node and repeatedly chooses either to follow a random outgoing edge (according to the weighted transition matrix) or to jump to a random node (with all nodes equally probable).
Problem Formulation | where 1 is a vector whose elements all equal 1 and whose size corresponds to that of V0 or VT, and μ is the damping factor, usually set to 0.85, as in the PageRank algorithm.
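The iteration described above, with the all-ones vector 1 and damping factor μ = 0.85, can be sketched in matrix form as follows; the column-normalization of the weighted adjacency into a transition matrix is an assumed convention.

```python
import numpy as np

def pagerank_matrix(W, mu=0.85, iters=100):
    """PageRank with a uniform teleport vector (1/n) * 1 and damping factor mu."""
    col_sums = W.sum(axis=0, keepdims=True)
    P = W / np.where(col_sums == 0, 1, col_sums)  # weighted transition matrix
    n = W.shape[0]
    one = np.ones(n)                              # the all-ones vector from the formula
    r = one / n                                   # start from the uniform distribution
    for _ in range(iters):
        r = mu * P @ r + (1 - mu) * one / n       # damped walk plus uniform jump
    return r
```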