Re: [igraph] parallel processing
From: Tamás Nepusz
Date: Tue, 9 Oct 2012 20:53:18 +0200

> It was a simplification of my program. I have a network data set with
> about 43,000,000 edges stored in an unusual format. I have to create an
> empty graph and append the edges myself. The processing cost is very high
> and it takes too long if I do it sequentially.
>
Doing it sequentially is not the problem. The problem is that the R
interface of igraph does not _modify_ the graph in-place when you add edges; it
creates a _copy_ of the graph and adds the edges to the copy instead. (That's
why you have to write g <- g + edges(whatever).) Copying the graph is
expensive, especially if you add the edges one by one. The solution is simple:
add your edges in batches; for instance, you can start reading your file and
construct the edge list in a simple R vector. When you reach 1 million edges,
you add all of them at once to g, clear your vector and continue reading the
file. I don't know whether R has an internal limit on the length of vectors; if
it doesn't, you can simply read all your 43,000,000 edges into a long vector of
86,000,000 numbers (two for each edge) and then construct the graph at once by
a single call to the graph() constructor.
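
A minimal sketch of both approaches, assuming a current igraph release in which
the relevant functions are named make_empty_graph(), add_edges() and
make_graph() (the graph() constructor mentioned above corresponds to
make_graph()). The tiny in-memory input, parse_line(), n_vertices and
batch_size are hypothetical stand-ins; the real file format is not described in
this thread, so the parsing step would need to be replaced.

```r
library(igraph)

# Hypothetical stand-in for the real input: one "from to" pair per line.
# The actual file format is unknown, so parse_line() is an assumption.
lines <- c("1 2", "2 3", "3 4", "4 5", "5 1", "1 3", "2 5")
parse_line <- function(line) as.numeric(strsplit(line, " ", fixed = TRUE)[[1]])

n_vertices <- 5   # assumed to be known in advance
batch_size <- 3   # edges per batch; about 1 million would suit the real data

# Approach 1: buffer edges and add them to the graph one batch at a time.
g <- make_empty_graph(n = n_vertices, directed = FALSE)
buffer <- numeric(2 * batch_size)   # preallocated, two numbers per edge
filled <- 0                         # number of edges currently in the buffer

for (line in lines) {
  buffer[(2 * filled + 1):(2 * filled + 2)] <- parse_line(line)
  filled <- filled + 1
  if (filled == batch_size) {
    g <- add_edges(g, buffer)       # one graph copy per batch, not per edge
    filled <- 0                     # reuse the buffer for the next batch
  }
}
if (filled > 0) {
  # Flush the partially filled last batch.
  g <- add_edges(g, buffer[seq_len(2 * filled)])
}

# Approach 2: collect every edge in one long vector and build the graph once.
all_edges <- unlist(lapply(lines, parse_line))
g2 <- make_graph(all_edges, n = n_vertices, directed = FALSE)
```

With the batched version the graph is copied once per batch rather than once
per edge, which is where the saving comes from. As for the vector length
question: ordinary R vectors can hold up to 2^31 - 1 elements, so a vector of
86,000,000 numbers is well within range, memory permitting.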
Cheers,
Tamas