From Yang <teddyyyy...@gmail.com>
Subject Re: Shared block storage via ZooKeeper
Date Wed, 13 Jul 2011 17:37:39 GMT
Assuming you use option 3 (BookKeeper), the following is probably way
oversimplified, but this is the idea:

All writers write to a BookKeeper ledger, and each of your actual
datastore nodes keeps reading that ledger. Each record is the
serialized form of a DB write op. When a ledger reader reads a record,
it deserializes it and applies it to its local datastore, for example
plain MySQL, BDB, or something like the LSM tree used by Cassandra.

Reads go directly to the datastore nodes themselves.
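
A rough sketch of that flow in Python, just to make it concrete. The
ledger here is an in-memory list standing in for a BookKeeper ledger, and
InMemoryLedger, serialize_write and StoreNode are made-up names for
illustration, not BookKeeper APIs. The JSON record format and the dict
used as the local store are assumptions too.

```python
import json


class InMemoryLedger:
    """Hypothetical stand-in for a BookKeeper ledger: an append-only record list."""

    def __init__(self):
        self._entries = []

    def append(self, record: bytes) -> int:
        """Append a record and return its entry id (its position in the log)."""
        self._entries.append(record)
        return len(self._entries) - 1

    def read_from(self, start: int):
        """Return all records with entry id >= start, in log order."""
        return self._entries[start:]


def serialize_write(block_id: int, data: bytes) -> bytes:
    """Serialize a write op; the JSON layout is an assumption for this sketch."""
    op = {"op": "put", "block": block_id, "data": data.hex()}
    return json.dumps(op).encode("utf-8")


class StoreNode:
    """A datastore node that replays the ledger into a local store (a dict here)."""

    def __init__(self, ledger: InMemoryLedger):
        self.ledger = ledger
        self.blocks = {}      # local store; could be MySQL, BDB, an LSM tree, ...
        self.next_entry = 0   # first ledger entry this node has not applied yet

    def catch_up(self) -> None:
        """Read every unapplied record, deserialize it, and apply it in order."""
        for record in self.ledger.read_from(self.next_entry):
            op = json.loads(record.decode("utf-8"))
            if op["op"] == "put":
                self.blocks[op["block"]] = bytes.fromhex(op["data"])
            self.next_entry += 1

    def read(self, block_id: int):
        """Serve a read from the local store only; the ledger is not consulted."""
        return self.blocks.get(block_id)


# Writers append serialized ops; every node replays the same log.
ledger = InMemoryLedger()
nodes = [StoreNode(ledger) for _ in range(3)]
ledger.append(serialize_write(7, b"hello"))
ledger.append(serialize_write(7, b"world"))
ledger.append(serialize_write(8, b"block"))
for node in nodes:
    node.catch_up()
```

Since every node applies the same records in the same order, they all end
up with the same contents. A node only serves what it has already applied,
so a read can lag behind the latest ledger entry until that node catches up.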

Would this work? That does not sound like a lot of work.

On Wed, Jul 13, 2011 at 3:02 AM, Simon Felix <de@iru.ch> wrote:
> Hello everyone
> What is the best way to build a distributed, shared storage system on top of
> ZooKeeper? I'm talking about block storage in the terabyte-range (i.e. store
> billions of 4k blocks). Consistency and Availability are important, as is
> throughput (both read & write). I need at least 50 MB/s with 3 nodes with
> two regular SATA drives each for my application.
> Some options I came up with:
> 1. Use ZooKeeper directly as a data store (Not recommended according to the
> docs - and it really leads to abysmally bad performance, I tested that)
> 2. Use Cassandra as data store
> 3. Use BookKeeper as write-ahead log and implement my own underlying store
> 4. Use ZooKeeper to create my own (probably buggy...) data store
> What would you recommend? Are there other options?
> Cheers,
> Simon

