hbase-dev mailing list archives

From Chad Walters <Chad.Walt...@microsoft.com>
Subject Re: ZK rethink?
Date Sun, 12 Apr 2009 14:43:19 GMT
My understanding is that Quantcast was running both HDFS and KFS in parallel since they didn't
fully trust either one. Can anyone confirm or deny this? Have they switched over fully to
using KFS?

KFS seemed interesting but, given that more development effort is directed at HDFS, it didn't
seem worth pursuing. Of course, as you point out, HDFS has been slow/resistant to implementing
some features important for HBase that KFS supports out of the box.

As things currently stand, it is unlikely that folks from the Powerset team will lead any
investigation into KFS. However, others are welcome to spend some time digging into it and
seeing how promising the prospect is.

I believe that KFS has an implementation of the Hadoop File System interface. It seems like
someone who was interested and motivated could hook up KFS under HBase that way to do some
basic testing. Perhaps some benchmarking with KFS would turn up some numbers that could be
used to spur some changes at HDFS...

Doing deeper integration with KFS that avoided the Hadoop FileSystem abstraction seems like
it would be a mistake (but I could be convinced otherwise). Proposing possible extensions
to the Hadoop FileSystem interface might be more fruitful.


On 4/11/09 10:21 PM, "Ryan Rawson" <ryanobjc@gmail.com> wrote:

Yes, Quantcast is sponsoring Sriram Rao (i.e., hired him) to work on KFS. Given HDFS's
disinterest in supporting features that are essential to HBase, we need to
do the right thing and recommend a solution that scales at the small end
(wouldn't it be nice not to mess with "xceiver" [sic] settings?) and at the big
end, and that closes the write hole.

Obviously KFS isn't for everyone, but we could recommend it to newbies and
potentially put an end to the parade of "I can't run on 3 nodes" newbie

On Apr 11, 2009 9:54 PM, "Andrew Purtell" <apurtell@apache.org> wrote:

I have heard that Quantcast is running a 700-node KFS
cluster and is sponsoring Sriram Rao as a full-time
employee now.

I'm also looking at GlusterFS. GlusterFS is interesting
in that it is completely decentralized. The clients
determine how the data is stored on the network. I'm
investigating how one might distribute configs (and
updates, when bricks are added or decommissioned) in
ZK. I have a small cluster up and running and it works
well enough to support Hadoop/mapreduce and HBase on
top of it. Performance and reliability seem better
than HDFS, but this is not quantified yet.

- Andy

> From: Ryan Rawson
> Subject: Re: ZK rethink?
> Date: Saturday, April 11, 2009, 7:50 PM
>
> Hdfs causes us so many scalability and data loss issues.
> I'm personally looking into kfs? As i...
