zookeeper-user mailing list archives

From Paul Carey <paul.p.ca...@gmail.com>
Subject Re: Transaction logs and hierarchical quorums
Date Mon, 28 Aug 2017 23:28:33 GMT
Just bumping this thread. I'm still interested to know if I'm
misunderstanding expected behaviour, or if something else is happening.
Thanks.

On Mon, 21 Aug 2017 at 10:18 Paul Carey <paul.p.carey@gmail.com> wrote:

> Hi
>
> In order to simplify handling of data center failover, I wanted to create
> a ZooKeeper ensemble where writes were synchronously replicated to at least
> one node in each DC before returning to the client.
>
> The section on hierarchical quorums [1] states that
>   "we are able to form a quorum once we have a majority of votes from a
> majority of non-zero-weight groups"
> I understand from this that if I have 2 non-zero-weight groups of 3 nodes,
> then a quorum must be formed from both groups, with at least 2 nodes from
> each. That is at least 4 nodes, and hence at least one node in each DC must
> be part of the quorum, thus ensuring each write is replicated to at least
> one node in each DC.
>
> I also understand from this line in the Programmer's Guide [2], and from
> various other places in the docs, that the transaction log will reflect
> every change applied to the znode tree.
>   "The most performance-critical part of ZooKeeper is the transaction log.
> ZooKeeper must sync transactions to media before it returns a response."
>
> Given the two points above, I would expect to see every zxid in at least
> four transaction logs. But under failure conditions, this is not what I
> see. I used `tc netem` to simulate a network split by progressively:
>   - increasing inter-DC latency from an average of 0.7ms (the DCs are 30km
> apart) to 10ms
>   - dropping 50% of packets between DCs
> I see the last zxid before total failure of the ensemble, 0x1b0002dfa5, in
> only 3 of the 6 transaction logs, suggesting to me that the hierarchical
> quorum was not correctly established before the write was accepted.
>
> But maybe I'm misunderstanding:
>   - maybe the presence of an entry in the transaction log is not the same
> as saying that the change will be applied to the in-memory state
>   - the zxids refer to createSession; perhaps quorum rules are not
> enforced for such calls
>
> Anyway, I'd be very grateful if someone could help me understand what I'm
> seeing here. Log snippets and config follow below. I'm running ZooKeeper
> 3.4.6 on RHEL 6.8.
>
> Many thanks
>
> Paul
>
> == Transaction Logs ==
>
> Host 3a
> 8/17/17 6:01:44 AM UTC session 0x35dee3497560039 cxid 0x0 zxid
> 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM UTC session 0x35deec8e21b0000 cxid 0x0 zxid
> 0x1c00000001 createSession 10000
>
> Host 3b
> 8/17/17 6:00:54 AM GMT session 0x15dee2f698e000d cxid 0x0 zxid
> 0x1b0002df44 closeSession null
> EOF reached after 10556 txns.
>
> Host 4a
> 8/17/17 6:01:44 AM GMT session 0x35dee3497560039 cxid 0x0 zxid
> 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid
> 0x1c00000001 createSession 10000
>
> Host 4b
> 8/17/17 6:01:25 AM GMT session 0x35dee3497560032 cxid 0x0 zxid
> 0x1b0002df88 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid
> 0x1c00000001 createSession 10000
> EOF reached after 6619 txns.
>
> Host 7a
> 8/17/17 6:01:44 AM GMT session 0x35dee3497560039 cxid 0x0 zxid
> 0x1b0002dfa5 createSession 10000
> 8/17/17 6:02:13 AM GMT session 0x35deec8e21b0000 cxid 0x0 zxid
> 0x1c00000001 createSession 10000
>
> Host 7b
> 8/17/17 6:01:25 AM GMT session 0x35dee3497560032 cxid 0x0 zxid
> 0x1b0002df88 createSession 10000
> EOF reached after 49071 txns.
>
>
> == Config ==
>
> server.1=3a:2888:3888
> server.2=3b:2888:3888
> server.3=4a:2888:3888
> server.4=4b:2888:3888
> server.5=7a:2888:3888
> server.6=7b:2888:3888
>
> group.1=1:2:4
> group.2=3:5:6
>
> weight.1=1
> weight.2=1
> weight.3=1
> weight.4=1
> weight.5=1
> weight.6=1
>
> [1]
> http://zookeeper.apache.org/doc/r3.4.6/zookeeperHierarchicalQuorums.html
> [2] http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html
>
>
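
To check the quorum reasoning in the quoted message against the posted config, here is a minimal Python sketch of the rule quoted from [1]. It is not ZooKeeper's own implementation: the function name `contains_quorum` and the dict layout are hypothetical, and the group and weight values are copied from the Config section.

```python
# Sketch of the hierarchical quorum rule quoted from [1]:
# "a majority of votes from a majority of non-zero-weight groups".
# GROUPS and WEIGHTS mirror the group.N / weight.N lines in the Config section.
GROUPS = {1: [1, 2, 4], 2: [3, 5, 6]}
WEIGHTS = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}


def contains_quorum(acks, groups=GROUPS, weights=WEIGHTS):
    """Return True if the server ids in `acks` satisfy the quoted rule."""
    # Total weight of each group; groups whose total is zero do not count.
    totals = {gid: sum(weights[s] for s in members)
              for gid, members in groups.items()}
    counted = [gid for gid, total in totals.items() if total > 0]

    satisfied = 0
    for gid in counted:
        acked = sum(weights[s] for s in groups[gid] if s in acks)
        # A group is satisfied by a strict majority of its weight.
        if acked > totals[gid] / 2:
            satisfied += 1

    # A strict majority of the counted groups must be satisfied.
    return satisfied > len(counted) / 2


if __name__ == "__main__":
    # Server ids 1, 3, 5 are hosts 3a, 4a, 7a in the posted config.
    print(contains_quorum({1, 3, 5}))
    print(contains_quorum({1, 2, 3, 5}))
```

Under this reading of the rule, the three servers whose logs contain 0x1b0002dfa5 (3a, 4a and 7a, ids 1, 3 and 5) do not form a quorum, because group 1 contributes only one of its three votes.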

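
To compare dumps like the ones in the Transaction Logs section, a small helper can list which hosts contain a given zxid and split a zxid into its epoch (high 32 bits) and counter (low 32 bits). This is only a sketch: `hosts_with_zxid`, `split_zxid` and the `dumps` mapping of host name to dump text are hypothetical names. The regex assumes only that each entry contains the word zxid followed by a hex value, possibly wrapped onto the next line or prefixed with a mail quote marker.

```python
import re

# Matches "zxid 0x..." even when the value is wrapped onto the next line
# or preceded by a ">" quote marker, as in the mailed log snippets.
ZXID_RE = re.compile(r"zxid[\s>]+(0x[0-9a-fA-F]+)")


def split_zxid(zxid):
    """Split a zxid (hex string or int) into (epoch, counter)."""
    value = int(zxid, 16) if isinstance(zxid, str) else zxid
    return value >> 32, value & 0xFFFFFFFF


def hosts_with_zxid(dumps, zxid):
    """Return the sorted host names whose dump text contains `zxid`.

    `dumps` maps a host name to the text of its transaction log dump.
    """
    target = int(zxid, 16)
    return sorted(
        host
        for host, text in dumps.items()
        if any(int(found, 16) == target for found in ZXID_RE.findall(text))
    )


if __name__ == "__main__":
    # Two entries copied from the Transaction Logs section.
    dumps = {
        "3a": "session 0x35dee3497560039 cxid 0x0 zxid\n0x1b0002dfa5 createSession 10000",
        "3b": "session 0x15dee2f698e000d cxid 0x0 zxid\n0x1b0002df44 closeSession null",
    }
    print(hosts_with_zxid(dumps, "0x1b0002dfa5"))
    print([hex(part) for part in split_zxid("0x1b0002dfa5")])
```

For the two zxids at the end of the posted logs, the epochs come out as 0x1b and 0x1c, so the 0x1c00000001 entry belongs to a later leader epoch than 0x1b0002dfa5.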