zookeeper-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: Shared block storage via ZooKeeper
Date Wed, 13 Jul 2011 18:03:13 GMT
Setting the "high performance" part aside (I would guess that it would
match the performance of BookKeeper, which is roughly 20k ops/sec),
why would there be consistency problems? I assume that BK uses the
same protocol as ZAB. If you mean that a storage node could lag in
applying the latest ledger entries, so that its state could be stale,
then yes, but that at least gives us a kind of eventual consistency
model.
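
To make the lag argument concrete, here is a minimal sketch of the
replay scheme. Everything in it is made up for illustration: a plain
Python list stands in for the BookKeeper ledger, and WriteOp,
StorageNode and catch_up are hypothetical names, not BookKeeper APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteOp:
    """One ledger entry: a write to apply to the store (hypothetical format)."""
    key: str
    value: str


@dataclass
class StorageNode:
    """A data store node that replays the shared log into a local dict."""
    store: dict = field(default_factory=dict)
    applied: int = 0  # number of ledger entries applied so far

    def catch_up(self, ledger, limit=None):
        """Apply unread entries in ledger order, optionally stopping at `limit`."""
        end = len(ledger) if limit is None else min(limit, len(ledger))
        for op in ledger[self.applied:end]:
            self.store[op.key] = op.value
        self.applied = max(self.applied, end)

    def read(self, key):
        """Serve reads locally; the answer may be stale if the node is lagging."""
        return self.store.get(key)


# A plain list stands in for the BookKeeper ledger (assumption for illustration).
ledger = [WriteOp("block-1", "a"), WriteOp("block-2", "b"), WriteOp("block-1", "c")]

fast, slow = StorageNode(), StorageNode()
fast.catch_up(ledger)           # has applied all three entries
slow.catch_up(ledger, limit=1)  # lagging: has applied only the first entry

stale = slow.read("block-1")    # old value, written by the first entry
fresh = fast.read("block-1")    # latest value

slow.catch_up(ledger)           # once caught up, both nodes agree
converged = slow.store == fast.store
```

While the slow node lags, a read of block-1 returns the old value; once
it has replayed the rest of the log, both nodes hold the same state.
Because every node applies the same entries in the same order, a node
can be behind but never diverges, which is what I mean by eventual
consistency here.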

On Wed, Jul 13, 2011 at 10:55 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> This would (roughly) work.  It will not give very high performance and you
> will have consistency problems.
>
> On Wed, Jul 13, 2011 at 10:37 AM, Yang <teddyyyy123@gmail.com> wrote:
>
>> Assuming you use option 3 (BookKeeper), the following is probably far
>> too simplified, but this is the idea:
>>
>> All writers write to a BookKeeper ledger, and each of your actual
>> datastore nodes keeps reading the ledger. Each record is the
>> serialized form of a DB write op. When the ledger reader reads a
>> record, it deserializes it and applies it to its local datastore,
>> for example MySQL, BDB, or something like the LSM tree used by
>> Cassandra (memtable + sstable).
>>
>> Reads go directly to the data store nodes themselves.
>>
>>
>> Would this work? It does not sound like a lot of work.
>>
>> On Wed, Jul 13, 2011 at 3:02 AM, Simon Felix <de@iru.ch> wrote:
>> > Hello everyone
>> >
>> > What is the best way to build a distributed, shared storage system on
>> > top of ZooKeeper? I'm talking about block storage in the terabyte-range
>> > (i.e. store billions of 4k blocks). Consistency and Availability are
>> > important, as is throughput (both read & write). I need at least 50 MB/s
>> > with 3 nodes with two regular SATA drives each for my application.
>> >
>> > Some options I came up with:
>> > 1. Use ZooKeeper directly as a data store (Not recommended according to
>> >    the docs - and it really leads to abysmally bad performance, I tested
>> >    that)
>> > 2. Use Cassandra as data store
>> > 3. Use BookKeeper as write-ahead log and implement my own underlying
>> >    store
>> > 4. Use ZooKeeper to create my own (probably buggy...) data store
>> >
>> > What would you recommend? Are there other options?
>> >
>> > Cheers,
>> > Simon
>> >
>>
>
