igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] Efficiently selecting a node in python igraph


From: Tamás Nepusz
Subject: Re: [igraph] Efficiently selecting a node in python igraph
Date: Sat, 10 May 2014 21:57:20 +0200

Hi,

> - find node object based on keyword attribute
Okay, this is a bit undocumented and must be improved for the next version, but 
here’s the trick: the “name” vertex attribute is indexed g.vs.find() is the 
right way to go but keep in mind that it has O(n) complexity since node 
attributes are not indexed -- except the “name” attribute, which is indexed. 
igraph will take advantage of this if you invoke g.vs.find() with a positional 
argument only *and* the type of this positional argument is string. (Integers 
won’t work because g.vs.find(X) will return the node with internal ID equal to 
X). So, this is what I have on my machine:

In [1]: g=Graph.Barabasi(100000)
In [2]: g.vs["name"] = map(str, range(100000))
In [3]: g.vs["another_attribute"] = map(str, range(100000))
In [4]: %timeit g.vs.find(another_attribute="99999")
100 loops, best of 3: 17.8 ms per loop
In [5]: %timeit g.vs.find(name="99999")
100 loops, best of 3: 18.6 ms per loop

but:

In [6]: %timeit g.vs.find("99999")
100000 loops, best of 3: 2.13 us per loop

The first issue here is that this behaviour is undocumented; the second is that 
it should work even if you specify g.vs.find(name=“99999”) since the two 
operations are essentially equivalent. An additional advantage of using a 
“name” attribute with type string is that you can use these names wherever 
igraph expects a node or node ID, _without_ having to look up the node 
explicitly using find(). However, using the “name” attribute comes at a cost 
since its internal index has to be invalidated every time you modify the graph 
and then rebuilt the next time you try to look up a node by name.

> - add new node to graph (add_vertex) [since this returns none I need to
> "find" the new node obj]
Here you can take advantage of the fact that the newly added node always 
receives the first available integer node ID, so its ID will be g.vcount()-1 
after the addition. This eliminates the need for .find() since you can simply 
use g.add_edge(paper_id_node, g.vcount()-1).

> Currently this is very slow. Any tips?
As Gabor said: igraph’s data structure is optimized for static graphs, so 
adding edges one by one is slower than adding them in batches. Maybe it is 
faster to do something like this:

id_gen = UniqueIdGenerator()
edgelist = []
for paper_id_1, paper_id_2 in whatever_generator_generates_your_edges:
    edgelist.append((id_gen[paper_id_1], id_gen[paper_id_2]))
g = Graph(edgelist)
g.vs[“name”] = id_gen.values()

-- 
T.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]