Re: [igraph] Negative Values in PageRank
From: Tamás Nepusz
Subject: Re: [igraph] Negative Values in PageRank
Date: Sat, 2 Nov 2013 21:09:12 +0100
Hello,
> Actually at each computation, I'm interested in the value of a specific node,
> and the relativeness of this value to the other nodes (I normalize the values
> to 0-1 scale). What I wanted to understand, is whether the values, even
> though negative, represent the "Real" picture – I mean, if the relativeness
> between the nodes is satisfying.
Unfortunately I cannot say; if ARPACK indeed failed to converge, then there is
nothing we can guarantee about the result. ARPACK is sort of a “black box” to
us as well (that’s why we cannot fix this bug easily). Basically, what happens
is that we invoke ARPACK’s eigensolver, which asks us to perform a matrix
multiplication every now and then; we are completely unaware of what is going
on inside ARPACK itself. If we are lucky, ARPACK returns a converged
eigenvector. If we are unlucky, ARPACK fails to converge and we are left with
a partial result, but we don’t know which parts of the vector have converged
and which have not.
There are multiple things that you can do here:
1) First of all, whenever you invoke the pagerank() method of the graph, check
whether ARPACK has converged. This can be done by retrieving the value of
arpack_options.iter, which returns the number of iterations ARPACK took during
the last invocation. If it is equal to or larger than arpack_options.maxiter,
there is reason to believe that ARPACK did not converge properly - so at least
you now have a way of detecting that something went wrong.
2) You can also try increasing arpack_options.maxiter; the default value is
3000, but we have recently found a graph for which it was too small (but ARPACK
managed to compute the converged eigenvector in ~15000 iterations).
3) We are experimenting with replacing ARPACK with the excellent PRPACK package
in the near future; PRPACK is specifically designed to compute PageRank scores
efficiently and it seems to be a lot better at that task than ARPACK. If you
don’t mind compiling the C core of igraph on your own, check out the
0.7-prpack branch of igraph from Github (see
https://github.com/igraph/igraph/tree/0.7-prpack) and see whether it works
better.
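To illustrate what the check in point 1 buys you, here is a self-contained
power-iteration sketch of PageRank (this is not igraph’s ARPACK-based solver;
the toy graph, the tolerance and the iteration cap are made up for
illustration). Hitting the cap without the residual dropping below the
tolerance is the analogue of arpack_options.iter reaching
arpack_options.maxiter:

```python
# Plain power-iteration PageRank on an adjacency dict (node -> successors).
# NOT igraph's ARPACK path; it only illustrates the convergence check:
# if the loop exhausts maxiter before the residual drops below tol,
# the returned scores should be treated as suspect.
def pagerank(adj, damping=0.85, tol=1e-10, maxiter=3000, x0=None):
    nodes = sorted(adj)
    n = len(nodes)
    x = dict(x0) if x0 is not None else {v: 1.0 / n for v in nodes}
    for it in range(1, maxiter + 1):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * x[v] / len(adj[v])
                for w in adj[v]:
                    nxt[w] += share
            else:
                # Dangling node: spread its rank uniformly.
                for w in nodes:
                    nxt[w] += damping * x[v] / n
        diff = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if diff < tol:
            return x, it, True      # converged within the cap
    return x, maxiter, False        # hit the cap: result is suspect

adj = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2], 4: [3]}
scores, iters, converged = pagerank(adj)
if not converged:
    print("warning: hit maxiter, the scores may be partial garbage")
```

With igraph itself, the equivalent of the `converged` flag is the
iter-versus-maxiter comparison described in point 1.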
> And another question – is there any optimization for running multiple
> PageRank computations on the same graph, where the graph is growing every
> time? The running takes a bit longer than I expected.
I think that one can reasonably expect the PageRank scores of the “old” and
“new” graph to be close to each other if you are doing only a simple addition
or removal of nodes during the growth of the graph. If you don’t mind altering
the C code of igraph, you can tweak the igraph_pagerank function to accept an
“initial guess” for the PageRank values. ARPACK would then start calculating
the eigenvector from the initial guess instead of starting from scratch, and
this could potentially save some time since you could simply supply the old
PageRank scores as an estimate when you are calculating the PageRank scores of
the new graph. However, this cannot be done from the Python interface alone;
you have to dig deeper and alter the C core of igraph.
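The “initial guess” idea can be sketched with a plain power iteration (again,
not igraph’s ARPACK code path; the toy graphs and the rule for padding the new
node’s score are illustrative assumptions):

```python
# Power-iteration PageRank, as a stand-in for the eigensolver; x0 is the
# optional initial guess that the tweaked igraph_pagerank would accept.
def pagerank(adj, damping=0.85, tol=1e-10, maxiter=3000, x0=None):
    nodes = sorted(adj)
    n = len(nodes)
    x = dict(x0) if x0 is not None else {v: 1.0 / n for v in nodes}
    for it in range(1, maxiter + 1):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * x[v] / len(adj[v])
                for w in adj[v]:
                    nxt[w] += share
            else:
                for w in nodes:
                    nxt[w] += damping * x[v] / n
        diff = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if diff < tol:
            return x, it, True
    return x, maxiter, False

# "Old" graph and its scores.
adj_old = {0: [1], 1: [2], 2: [0, 3], 3: [0]}
old_scores, _, _ = pagerank(adj_old)

# The graph grows: node 4 appears, with edges 2 -> 4 and 4 -> 0.
adj_new = {0: [1], 1: [2], 2: [0, 3, 4], 3: [0], 4: [0]}

# Cold start: uniform initial vector.
new_scores, it_cold, ok_cold = pagerank(adj_new)

# Warm start: reuse the old scores, give the new node an average share
# (an illustrative padding rule), then renormalize to a distribution.
guess = dict(old_scores)
guess[4] = 1.0 / len(adj_new)
total = sum(guess.values())
guess = {v: s / total for v, s in guess.items()}
warm_scores, it_warm, ok_warm = pagerank(adj_new, x0=guess)
print("cold:", it_cold, "warm:", it_warm)
```

When the graph changes only slightly, the warm start typically needs fewer
iterations than the uniform cold start, which is exactly the saving that
passing an initial guess down to the C core would give you.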
—
T.