Re: [igraph] igraph in large dataset
From: Tamas Nepusz
Subject: Re: [igraph] igraph in large dataset
Date: Tue, 17 Nov 2009 20:41:25 +0000
Hi,
> The dataset consists of 60,000 nodes and more than 1M edges.
> I want to get some basic graph properties (i.e. diameter, component,
> clustering coefficient...)
> The program is running on a laptop (1G RAM). It has been running more than 12
> hours. It is still running.....
> I have tested another dataset, which consists of 60,000 nodes and 100,000
> edges, it took about 20 minutes to finish.
> I am wondering whether Igraph can handle such large dataset (60,000 nodes and
> 1M edges). and if it can, how long it needs to finish.
Yes, igraph is able to handle such large datasets, so that shouldn't be the
problem. In fact, I'm not surprised that it didn't finish within 12 hours, and
I would blame it primarily on the clustering coefficient calculation. The
clustering coefficient can be calculated in O(|V| * d^2) time where |V| is the
number of vertices and d is the average degree. Both of your graphs have 60,000
vertices, so I will make the simple assumption that your graph with 1M edges
has an average degree ten times that of your other graph with 100K edges. Since
the running time grows with d^2, igraph will probably need about a hundred
(10^2) times as long to calculate the clustering coefficient for the graph
with 1M edges as for the graph with 100K edges. Assuming that the 20
minutes used for the graph with 100K edges was almost completely used for the
clustering coefficient calculation, you can expect the calculation for the
large graph to finish in approximately 2000 minutes = 33 hours. Maybe more,
maybe less.
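
To make the arithmetic explicit, here is a rough sketch of the same
back-of-the-envelope estimate in Python. The function names are made up for
illustration and are not part of igraph; the sketch assumes an undirected
graph, treats the cost as exactly proportional to |V| * d^2 and ignores
constant factors, memory effects and the actual degree distribution, so take
the result as an order-of-magnitude guess only.

```python
def average_degree(n_vertices, n_edges):
    # In an undirected graph every edge adds one to the degree of two vertices.
    return 2.0 * n_edges / n_vertices


def clustering_cost(n_vertices, n_edges):
    # Cost model from the text: proportional to |V| * d^2 (constants ignored).
    d = average_degree(n_vertices, n_edges)
    return n_vertices * d ** 2


def extrapolate_minutes(known_minutes, known_graph, target_graph):
    # Scale a measured running time by the ratio of the modelled costs.
    ratio = clustering_cost(*target_graph) / clustering_cost(*known_graph)
    return known_minutes * ratio


small = (60_000, 100_000)    # (vertices, edges), took about 20 minutes
large = (60_000, 1_000_000)  # (vertices, edges), still running

estimate = extrapolate_minutes(20.0, small, large)
print(f"about {estimate:.0f} minutes, i.e. roughly {estimate / 60:.0f} hours")
```

If the 20 minutes also included the diameter and component calculations, the
part attributable to the clustering coefficient is smaller and the estimate
shrinks accordingly.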
--
Tamas