zookeeper-user mailing list archives

From Tsuyoshi OZAWA <ozawa.tsuyo...@gmail.com>
Subject Re: [announce] Accord: A high-performance coordination service for write-intensive workloads
Date Mon, 26 Sep 2011 02:18:20 GMT
On Sun, Sep 25, 2011 at 8:14 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Zookeeper should have one hard drive for logging and one for snapshots to
> avoid seeks.

Yes, I used two HDDs: one dedicated to logging, the other dedicated to
snapshots.
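
For reference, the split can be expressed directly in ZooKeeper's zoo.cfg
via dataDir and dataLogDir; the mount points below are only illustrative,
not my exact paths:

    # zoo.cfg -- snapshots and the transaction log on separate spindles
    tickTime=2000
    clientPort=2181
    # snapshots are written under dataDir
    dataDir=/disk1/zookeeper/data
    # the transaction log gets its own dedicated disk to avoid seek contention
    dataLogDir=/disk2/zookeeper/txnlog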

>>> 2- The previous observation leads me to the next question: could you say
>>> more about your use of disk with persistence on?
>>>
>> ZooKeeper returns an ACK after the write has reached the disks of more than
>> half of the machines. Accord returns an ACK after writing to the disk of just
>> one machine, the one that accepted the request. However, at the same time,
>> the ACK assures that all servers receive the messages in the same order.
>>
>
> It is a bit of an open question about just how hard one should push
> durability.  I believe that Volt, for instance, commits when enough servers
> confirm that they have queued up the log entry, but they don't wait for the
> logging to complete.  Since the log writer can have very high throughput,
> this allows some very high throughput rates at the cost of some risk of
> regression if you lose power to all servers exactly simultaneously.  Even
> with a blown circuit breaker, the power supply holdup time is commonly
> enough to flush a moderate amount of disk buffers (30ms or more).  If you
> can stop committing instantly when power drops, it may be pretty safe.  If
> you have any UPS with a power loss warning, then you are probably quite
> safe.  If you are OK with in-order time slippage on disastrous power loss,
> then you should be fine.

Yeah, this is the tradeoff between fault tolerance and performance.
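
To make the tradeoff concrete, here is a minimal Java sketch of the two
acknowledgement policies; the class and method names are hypothetical and
are not taken from ZooKeeper, Accord, or Volt:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    class LogWriter {
        private final FileChannel log;
        private final Queue<ByteBuffer> pending = new ConcurrentLinkedQueue<>();

        LogWriter(FileChannel log) { this.log = log; }

        // Durable policy: acknowledge only after the entry is on stable storage.
        void appendDurable(ByteBuffer entry) throws IOException {
            log.write(entry);
            log.force(false);   // fsync before the ACK
            ack();
        }

        // Fast policy (as described for Volt above): acknowledge once the entry
        // is queued for the background writer; risks losing the tail of the log
        // if every replica loses power at exactly the same instant.
        void appendFast(ByteBuffer entry) {
            pending.add(entry);
            ack();              // ACK before the bytes reach the disk
        }

        // A background thread drains the queue and syncs in batches (group commit).
        void flushOnce() throws IOException {
            ByteBuffer entry;
            while ((entry = pending.poll()) != null) {
                log.write(entry);
            }
            log.force(false);
        }

        private void ack() { /* reply to the client or follower here */ }
    }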

One proposal is a pluggable storage layer for ZooKeeper.
It would work like the MySQL pluggable storage engine layer.

Users who need fault tolerance would use ZooKeeper's storage and
messaging engine, while users who need performance would use Accord's.
ZooKeeper users could then select the semantics that fit their use case.
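
As a rough illustration only, such a layer might expose an interface along
these lines; nothing like this exists in ZooKeeper today, and all names
below are hypothetical:

    import java.io.IOException;

    // Hypothetical pluggable backend -- not part of ZooKeeper's current code base.
    interface CoordinationBackend {
        // Append a transaction and return only when this backend's durability
        // guarantee is met (e.g. quorum fsync for a ZooKeeper-style engine,
        // single-node fsync for an Accord-style engine).
        void append(long zxid, byte[] txn) throws IOException;

        // Persist a snapshot of the current in-memory tree.
        void snapshot(byte[] serializedTree) throws IOException;
    }

    // A deployment would then pick the engine matching its needs, e.g.
    //   CoordinationBackend b = new QuorumDiskBackend();   // fault tolerance first
    //   CoordinationBackend b = new SingleDiskBackend();   // throughput first
    // (both class names are hypothetical)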

>> The difference in semantics means that this measurement is not fair.
>> I would like to measure under a fair setup, but have not done so yet. If there
>> are requests from users, I'm going to implement it and measure it. Note that
>> the in-memory benchmark is fair.
>
> The in-memory throughput for Zookeeper looks about like the disk version
> should look.

The benchmark was measured with ZooKeeper on a /dev/shm (tmpfs) device.
Is there an implementation of an in-memory mode for ZooKeeper?
If so, I'll benchmark with it.
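
For completeness, the in-memory numbers came from pointing ZooKeeper's data
directories at tmpfs; the exact paths below are illustrative:

    # zoo.cfg for the in-memory comparison -- paths are illustrative
    dataDir=/dev/shm/zookeeper/data
    dataLogDir=/dev/shm/zookeeper/txnlog
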
-- 
OZAWA Tsuyoshi <ozawa.tsuyoshi@gmail.com>
