Re: [Gluster-devel] scalability vs. namespace

From: Onyx
Subject: Re: [Gluster-devel] scalability vs. namespace
Date: Sat, 08 Dec 2007 08:52:03 +0100
User-agent: Thunderbird (Windows/20071031)

I'd like to know the answer to this one too.
We plan to use glusterfs for an online backup service that will store a lot of small files. I'd like to know the theoretical limits on the number of files in a glusterfs cluster (and whether they depend on a 32- or 64-bit OS or on the underlying filesystem), and whether there is anything else to consider when setting up a practical, usable cluster with a lot of files.

Petr Kacer wrote:

  first of all, THANK YOU for the work you do! :-)

  GlusterFS looks very promising and I would like to try it in
my cluster setup. However, what is not entirely clear to me (even after
reading the docs and mailing list archives) is how well (if at all) it
scales in terms of the total number of stored files. Just thinking in the
long run... it might not be an issue today but may well
be one tomorrow.

  I know Unify does a great job distributing the data across all bricks.
So does Stripe, should the files be larger. But let's suppose I want to
(theoretically) store a lot of files (e.g. a billion) or, better yet,
suppose the number of stored files grows with time and I have to keep
them all. What will become the bottleneck?

  Adding more bricks is definitely possible, so the total _size_ of files
is not an issue per se; the glusterfs design seems to cope with
that just fine. The same is not true of the total _number_ of files: if
I got it right (please correct me if not), each and every brick has
to have enough space to host the entire directory tree structure and,
"worse" yet, the namespace brick has to be able to store as many inodes
as there are files in the whole cluster (including the directory
structure), making it the possible weak link of such a setup from the
total-number-of-files point of view.
  Is there currently a way around this (other than getting a larger
drive and/or using a different FS for the namespace brick, which merely
gets you some time but does not really scale)? Or is the namespace meant
to be "just a kind of cache" where some "garbage collection" might
eventually be performed in the future?
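
For anyone who wants to put rough numbers on this, here is a minimal sketch, assuming (as described above, and not confirmed by the developers in this thread) that the namespace brick needs one inode per file and per directory in the cluster. The function names and the example path are hypothetical; only `os.statvfs` is a real call, and it is available on Unix-like systems only. Filesystems that allocate inodes dynamically may report counts that are zero or not meaningful, so treat the result as an estimate.

```python
import os


def inodes_needed(num_files, num_dirs):
    """Inodes the namespace brick would need, ASSUMING one inode per
    file and one per directory (the poster's understanding)."""
    return num_files + num_dirs


def fits(free_inodes, num_files, num_dirs):
    """True if the planned tree fits into the given number of free inodes."""
    return inodes_needed(num_files, num_dirs) <= free_inodes


def free_inodes(path):
    """Free inodes available to unprivileged users on the filesystem
    that holds `path` (Unix only)."""
    return os.statvfs(path).f_favail


if __name__ == "__main__":
    # Hypothetical export directory of the namespace brick.
    namespace_export = "/"
    available = free_inodes(namespace_export)
    planned_files, planned_dirs = 1_000_000_000, 50_000_000
    print("free inodes:", available)
    print("fits a billion files:", fits(available, planned_files, planned_dirs))
```

Running it against the namespace brick's export directory shows how much inode headroom the backing filesystem currently has; it says nothing about GlusterFS's own limits.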


