zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: [announce] Accord: A high-performance coordination service for write-intensive workloads
Date Sun, 25 Sep 2011 11:14:13 GMT
On Sun, Sep 25, 2011 at 12:02 AM, OZAWA Tsuyoshi <ozawa.tsuyoshi@lab.ntt.co.jp> wrote:

> ...
>> 1- I was wondering if you can give more detail on the setup you used to
>> generate the numbers you show in the graphs on your Accord page. The
>> ZooKeeper values are way too low, and I suspect that you're using a
>> single hard drive. It could be because you expect to use a single hard
>> drive with an Accord server, and you wanted to make the comparison fair.
>> Is this correct?
> No, it isn't.
> Both ZooKeeper and Accord use the dedicated hard drive for logging.

ZooKeeper should have one hard drive for logging and a separate one for
snapshots, so that log writes avoid seeks.
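
A rough way to check for this on a given server is sketched below. It
assumes the standard zoo.cfg keys dataDir (snapshots) and dataLogDir
(transaction log); the helper names are made up for illustration. Note
that st_dev identifies a filesystem, not a physical disk, so two
partitions on the same spindle would still pass this check.

```python
import os


def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg-style text, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def same_device(path_a, path_b):
    """True if both paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def log_shares_disk_with_snapshots(settings):
    """True if the transaction log and the snapshots share a device."""
    data_dir = settings["dataDir"]
    # When dataLogDir is not set, ZooKeeper writes the log into dataDir.
    log_dir = settings.get("dataLogDir", data_dir)
    return same_device(data_dir, log_dir)
```

If log_shares_disk_with_snapshots() returns True for a benchmark
machine, the log and snapshot writes are competing for the same device.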

>> 2- The previous observation leads me to the next question: could you say
>> more about your use of disk with persistence on?
> ZooKeeper returns ACK after writing to the disks of over half of the machines.
> Accord returns ACK after writing the disk of just one machine, which
> accepted a request. However, at the same time, the ACK assures that all
> servers receive the messages in the same order.

Just how hard one should push durability is a bit of an open question.
I believe that Volt, for instance, commits when enough servers
confirm that they have queued up the log entry, but they don't wait for the
logging to complete.  Since the log writer can have very high throughput,
this allows some very high throughput rates at the cost of some risk of
regression if you lose power to all servers exactly simultaneously.  Even
with a blown circuit breaker, the power supply holdup time is commonly
enough to flush a moderate amount of disk buffers (30ms or more).  If you
can stop committing instantly when power drops, it may be pretty safe.  If
you have any UPS with a power loss warning, then you are probably quite
safe.  If you are OK with in-order time slippage on disastrous power loss,
then you should be fine.
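
To make the difference between these acknowledgement rules concrete,
here is a toy model, not a description of any of the three systems'
actual code. The policy names and the function are invented for
illustration, and the "one_flushed_all_received" rule is only my reading
of the Accord description quoted above. Each server has a time at which
it has queued the entry in memory and a later time at which the entry is
flushed to disk.

```python
def ack_time(queued_ms, flushed_ms, policy):
    """Time at which the client would get its ACK under a given policy.

    queued_ms[i]  : when server i has the entry queued in memory
    flushed_ms[i] : when server i has the entry on disk
    Index 0 is assumed to be the server that accepted the request.
    """
    n = len(queued_ms)
    majority = n // 2 + 1
    if policy == "majority_flushed":
        # ACK once more than half of the servers have written to disk.
        return sorted(flushed_ms)[majority - 1]
    if policy == "quorum_queued":
        # ACK once a majority has queued the entry; flushing happens later.
        return sorted(queued_ms)[majority - 1]
    if policy == "one_flushed_all_received":
        # ACK once the accepting server has flushed and every server
        # has received the message.
        return max(flushed_ms[0], max(queued_ms))
    raise ValueError("unknown policy: %s" % policy)


queued = [1, 2, 3]       # ms, made-up numbers
flushed = [11, 12, 13]   # ms, made-up numbers
for name in ("majority_flushed", "quorum_queued", "one_flushed_all_received"):
    print(name, ack_time(queued, flushed, name))
```

The gap between the quorum_queued ACK and the later flush times is the
window in which a simultaneous power loss on all servers could lose an
acknowledged write.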

> The difference of the semantics means that this measurement is not fair.
> I would like to measure the under fair situation, but not yet. If there are
> requests from users, I'm going to implement it and measure it. Note that the
> benchmark of in-memory is fair.

The in-memory throughput for ZooKeeper looks about like what the disk version
should look like.
