zookeeper-user mailing list archives

From Simon Felix ...@iru.ch>
Subject RE: Shared block storage via ZooKeeper
Date Wed, 13 Jul 2011 18:01:38 GMT
Could you explain why that is?

What level of performance do you expect?
Why would there be consistency problems?

> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Wednesday, July 13, 2011 19:55
> To: user@zookeeper.apache.org
> Subject: Re: Shared block storage via ZooKeeper
> 
> This would (roughly) work.  It will not give very high performance and you
> will have consistency problems.
> 
> On Wed, Jul 13, 2011 at 10:37 AM, Yang <teddyyyy123@gmail.com> wrote:
> 
> > Assuming you use option 3 (BookKeeper), the following is probably way
> > too oversimplified, but that's the idea:
> >
> > All writers write to a BookKeeper ledger, and each of your actual
> > datastore nodes keeps reading the ledger. Each record would be the
> > serialized form of a DB write op; when the ledger reader reads out
> > a record, it deserializes it and applies it to its local datastore,
> > for example MySQL, BDB, or something like the LSM tree
> > used by Cassandra (memtable+sstable).
> >
> > Reads go directly to the datastore nodes themselves.
> >
> >
> > Would this work? That does not sound like a lot of work.
> >
> > On Wed, Jul 13, 2011 at 3:02 AM, Simon Felix <de@iru.ch> wrote:
> > > Hello everyone
> > >
> > > > What is the best way to build a distributed, shared storage system
> > > > on top of ZooKeeper? I'm talking about block storage in the
> > > > terabyte-range (i.e. store billions of 4k blocks). Consistency and
> > > > Availability are important, as is throughput (both read & write).
> > > > I need at least 50 MB/s with 3 nodes with two regular SATA drives
> > > > each for my application.
> > >
> > > Some options I came up with:
> > > > 1. Use ZooKeeper directly as a data store (Not recommended according
> > > >    to the docs - and it really leads to abysmally bad performance,
> > > >    I tested that)
> > > > 2. Use Cassandra as data store
> > > > 3. Use BookKeeper as write-ahead log and implement my own underlying
> > > >    store
> > > > 4. Use ZooKeeper to create my own (probably buggy...) data store
> > >
> > > What would you recommend? Are there other options?
> > >
> > > Cheers,
> > > Simon
> > >
> >
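
The log-replay design Yang describes in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions: `Log` is a hypothetical in-memory stand-in for a BookKeeper ledger (it does not use the BookKeeper API), `Replica` is a hypothetical datastore node backed by a plain dict, and JSON is an arbitrary choice of serialization for the write ops.

```python
import json


class Log:
    """In-memory append-only log; a hypothetical stand-in for a ledger."""

    def __init__(self):
        self._entries = []

    def append(self, op):
        # Serialize the write op before appending, as the writers would.
        self._entries.append(json.dumps(op).encode("utf-8"))
        return len(self._entries) - 1  # id of the new entry

    def read_from(self, start):
        # Return all entries with id >= start, in log order.
        return self._entries[start:]


class Replica:
    """Hypothetical datastore node that replays the log into a local store."""

    def __init__(self, log):
        self.log = log
        self.store = {}       # block number -> data
        self.next_entry = 0   # first log entry not yet applied

    def catch_up(self):
        # Deserialize each unapplied record and apply it to the local store.
        for raw in self.log.read_from(self.next_entry):
            op = json.loads(raw.decode("utf-8"))
            self.store[op["block"]] = op["data"]
            self.next_entry += 1

    def read(self, block):
        # Reads go straight to the local store, without consulting the log.
        return self.store.get(block)


log = Log()
a, b = Replica(log), Replica(log)

log.append({"block": 7, "data": "v1"})
a.catch_up()
b.catch_up()

log.append({"block": 7, "data": "v2"})
a.catch_up()  # b has not replayed the second write yet

print(a.read(7), b.read(7))  # v2 v1
b.catch_up()
print(a.read(7), b.read(7))  # v2 v2
```

In this sketch, a node that lags behind the log serves an older value for the same block until it catches up. That may be one source of the consistency concern raised in the reply, although the thread does not say which problem was meant.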