zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Creating a znode with SEQUENTIAL_EPHEMERAL mode becomes corrupt in case of unstable network
Date Wed, 21 Sep 2011 14:30:47 GMT
If you cannot tolerate this sort of situation, then the only solution is
typically to avoid sequential ephemerals.  The problem is that in the
presence of a flaky network you cannot always tell if a failed create
actually created the znode in question.  This is because the network may
have failed after the create succeeded, but before you got the result.  In
that case, since this is a sequential ephemeral, you can't know if your znode
got created because you don't even know the name.  Moreover, scanning
doesn't help because if you could scan, you probably could have used a fixed
unique name in the first place.
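
The ambiguity can be shown with a small simulation. This is only a sketch:
FakeEnsemble, ConnectionLoss, NodeExists and create_with_blind_retry are
hypothetical stand-ins written for illustration, not the ZooKeeper client
API, and the znode paths are made up. The fake server applies the create and
then loses the reply. With a sequential name a blind retry leaves two znodes
behind; with a fixed name the retry fails with "node exists", which tells the
client that the first attempt went through.

```python
import itertools


class ConnectionLoss(Exception):
    """The reply to a request never reached the client."""


class NodeExists(Exception):
    """A create targeted a name that is already taken."""


class FakeEnsemble:
    """In-memory stand-in for an ensemble (hypothetical, not the real API)."""

    def __init__(self):
        self.znodes = set()
        self._seq = itertools.count()
        self.drop_next_reply = False

    def create(self, path, sequential=False):
        # Sequential creates get a server-chosen, zero-padded suffix.
        name = f"{path}{next(self._seq):010d}" if sequential else path
        if name in self.znodes:
            raise NodeExists(name)
        self.znodes.add(name)        # the write is applied on the server...
        if self.drop_next_reply:     # ...but the reply is lost in transit
            self.drop_next_reply = False
            raise ConnectionLoss()   # the client never learns the name
        return name


def create_with_blind_retry(zk, path, sequential):
    """Retry once after a lost reply, without knowing what happened."""
    try:
        return zk.create(path, sequential=sequential)
    except ConnectionLoss:
        try:
            return zk.create(path, sequential=sequential)
        except NodeExists:
            # Only possible with a fixed name: the first attempt succeeded.
            return path


# Sequential name: the retry silently creates a second znode.
seq = FakeEnsemble()
seq.drop_next_reply = True
seq_name = create_with_blind_retry(seq, "/queue/item-", sequential=True)

# Fixed unique name: the retry reveals that the first create succeeded.
fixed = FakeEnsemble()
fixed.drop_next_reply = True
fixed_name = create_with_blind_retry(fixed, "/workers/host-a", sequential=False)

print(sorted(seq.znodes))
print(sorted(fixed.znodes))
```

In the sequential case the client only ever learns the second name, so the
first znode is orphaned until its session ends. The sketch ignores sessions
and ephemeral cleanup entirely.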

There is a long-standing proposed solution for this, nearly complete, that
requires some difficult coding.  See

2011/9/21 Fournier, Camille F. <Camille.Fournier@gs.com>

> This is expected. In cases where the network becomes unstable, it is the
> responsibility of the client writer to handle disconnected events
> appropriately and check to verify whether nodes they tried to write around
> the time of these events did or did not succeed. It makes writing a
> "Generic" client for ZK very difficult (search the mailing list for zkclient
> and you'll read a bunch of convos around this topic). Fortunately, many
> things that rely on EPHEMERAL_SEQUENTIAL nodes can tolerate some duplication
> of data, so often it's not a huge problem.
> C
> -----Original Message-----
> From: 박영근(Alex) [mailto:alex.park@nexr.com]
> Sent: Wednesday, September 21, 2011 9:16 AM
> To: dev@zookeeper.apache.org
> Cc: user@zookeeper.apache.org
> Subject: Creating a znode with SEQUENTIAL_EPHEMERAL mode becomes corrupt in
> case of unstable network
> Hi, All
> I ran into a problem creating a znode in SEQUENTIAL_EPHEMERAL mode under
> unstable network conditions.
> Although the client never received a reply confirming that the sequential
> node was created, the ensemble does contain the znode, as verified with
> zookeeper dashboard (
> https://github.com/phunt/zookeeper_dashboard).
> If the client receives a DISCONNECTED event, it tries to reconnect.
> Session timeout is 30 seconds.
> The unstable network condition is produced as follows:
> The grinder agent sends a request of creating a znode of
> The ZK ensemble has three servers.
> The NIC of each server is brought down and up repeatedly:
> the NIC of server1 goes down every minute, stays down for 9 seconds, then
> comes back up;
> the NIC of server2 goes down every 2 minutes, stays down for 9 seconds, then
> comes back up;
> the NIC of server3 goes down every 3 minutes, stays down for 9 seconds, then
> comes back up.
> Does anyone have any ideas, or know of a related issue?
> Thanks in advance.
> Alex
