To: cassandra-user@incubator.apache.org
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: cassandra over hbase
Date: Tue, 24 Nov 2009 08:40:46 -0600
Message-ID: <877htg0yvl.fsf@lifelogs.com>
References: <f40963db0911211550k6823c551g355f448f66f54d4d@mail.gmail.com>
	<OF363C0D26.B37CD2C4-ON88257677.006C1A1B-88257677.006DB2DC@us.ibm.com>

On Mon, 23 Nov 2009 11:58:08 -0800 Jun Rao <junrao@almaden.ibm.com> wrote: 

JR> After chatting with some Facebook guys, we realized that one potential
JR> benefit from using HDFS is that the recovery from losing partial data in a
JR> node is more efficient. Suppose that one lost a single disk at a node. HDFS
JR> can quickly rebuild the blocks on the failed disk in parallel. This is a
JR> bit hard to do in cassandra, since we can't easily find the data on the
JR> failed disk from another node. 

This is an architectural issue, right?  IIUC Cassandra simply doesn't
care about disks.  I think that's a plus, actually: it simplifies the
code, and in my experience filesystems are better left to the OS.  For
instance, we're evaluating Lustre, and for many specific reasons it's
significantly better for our needs than HDFS, so HDFS would be a tough
sell.

JR> So, when this happens, the whole node probably has to be taken out
JR> and bootstrapped. The same problem exists when a single sstable file
JR> is corrupted.

I think recovering a single sstable is a useful thing, and it seems like
a better problem to solve.
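
To make the single-sstable case a bit more concrete, here is a minimal
sketch of the detection half of the problem.  It is not Cassandra code.
It assumes a hypothetical per-node manifest mapping each data file name
to a SHA-256 digest (I don't know that Cassandra keeps anything like
this), and the names file_digest and find_damaged are made up for
illustration.  The idea is only that if a node can tell which files are
missing or corrupted, just those need to be fetched again from a
replica rather than bootstrapping the whole node.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_damaged(data_dir, manifest):
    """Return the sorted names of files that are missing or fail their digest.

    `manifest` maps file name -> expected hex digest.  This manifest is a
    hypothetical structure assumed for the sketch, not an existing feature.
    """
    damaged = []
    for name, expected in manifest.items():
        path = Path(data_dir) / name
        if not path.is_file() or file_digest(path) != expected:
            damaged.append(name)
    return sorted(damaged)
```

Fetching the replacement data from another node is the hard part and is
deliberately left out here.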

Ted