zookeeper-user mailing list archives

From Neha Narkhede <neha.narkh...@gmail.com>
Subject Re: watchers not fired after a disk failure?
Date Mon, 05 Mar 2012 23:09:56 GMT
Jeremy,

>> * This session is able to successfully create ephemeral znodes, but watches
>> never fire for the session.  For example, when the session has a children
>> watch set on /election/a, and then creates /election/a/a_00000001, its watch
>> on /election/a never fires (but it does fire for sessions coming from other
>> clients).

Have you tried running wchc on your ZooKeeper servers to see which
paths each session ID is watching?
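
A minimal sketch of how that check could be scripted, assuming the servers
answer four-letter commands on their client port and that wchc replies with
one session-ID line followed by tab-indented watched paths. The function
names, host, and port are illustrative placeholders, not part of ZooKeeper:

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter command (e.g. "wchc") and return the reply text.

    The server closes the connection after replying, so we read until EOF.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_wchc(text):
    """Map each session ID to the list of paths it is watching.

    Assumed reply layout: an unindented session-ID line, then one
    whitespace-indented line per watched path.
    """
    watches = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Indented line: a path watched by the current session.
            if current is not None:
                watches[current].append(line.strip())
        else:
            # Unindented line: start of a new session entry.
            current = line.strip().lower()
            watches.setdefault(current, [])
    return watches


# Hypothetical usage; replace the host and port with a real server's
# client address:
# reply = four_letter_word("zk-server.example", 2181, "wchc")
# print(parse_wchc(reply).get("0xd235ad12815a0000"))
```

If the session from the log below is missing from the result, or /election/a
is not among its paths, that would suggest the watch was never registered on
that server rather than registered and not delivered.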

Thanks,
Neha

On Mon, Mar 5, 2012 at 2:47 PM, Jeremy Stribling <strib@nicira.com> wrote:
> I have been investigating an issue at one of our customers (our product
> embeds Zookeeper in it), and here's a summary of what I've been able to
> discern:
>
> * One of the servers in a 3-server ZK cluster (ZK 3.3.3 with some patches, C
> client) experiences a hardware/firmware failure of its RAID partition
> (possibly affecting the network card as well).  From /var/log/syslog:
>
> -------------------------------
> Feb 24 01:53:54 controller3 kernel: [881855.148384] megaraid_sas
> 0000:03:00.0: vpd r/w failed.  This is likely a firmware bug on this device.
>  Contact the card vendor for a firmware update.
> Feb 24 01:53:54 controller3 kernel: [1398904.161148] bnx2 0000:01:00.0: irq
> 73 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161157] bnx2 0000:01:00.0: irq
> 74 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161165] bnx2 0000:01:00.0: irq
> 75 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161173] bnx2 0000:01:00.0: irq
> 76 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161181] bnx2 0000:01:00.0: irq
> 77 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161188] bnx2 0000:01:00.0: irq
> 78 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161196] bnx2 0000:01:00.0: irq
> 79 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161203] bnx2 0000:01:00.0: irq
> 80 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.161210] bnx2 0000:01:00.0: irq
> 81 for MSI/MSI-X
> Feb 24 01:53:54 controller3 kernel: [1398904.241931] bnx2 0000:01:00.0:
> eth0: using MSIX
> Feb 24 01:53:54 controller3 kernel: [1398904.243302] ADDRCONF(NETDEV_UP):
> eth0: link is not ready
> Feb 24 01:53:57 controller3 kernel: [1398907.309739] bnx2 0000:01:00.0:
> eth0: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow
> control ON
> Feb 24 01:53:57 controller3 kernel: [1398907.311239]
> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> Feb 24 01:55:06 controller3 ntpdate[10764]: step time server 72.3.128.241
> offset 63.134559 sec
> Feb 24 01:55:06 controller3 collectd[1691]: uc_update: Value too old: name =
> controller/disk-sda/disk_octets; value time = 1330048506; last cache update
> = 1330048506;
> Feb 24 01:55:06 controller3 collectd[1691]: Filter subsystem: Built-in
> target `write': Dispatching value to all write plugins failed with status
> -1.
> Feb 24 01:55:06 controller3 collectd[1691]: uc_update: Value too old: name =
> controller/disk-sda/disk_ops; value time = 1330048506; last cache update =
> 1330048506;
> Feb 24 01:55:06 controller3 collectd[1691]: Filter subsystem: Built-in
> target `write': Dispatching value to all write plugins failed with status
> -1.
> ...
> Feb 24 01:55:06 controller3 collectd[1691]: Filter subsystem: Built-in
> target `write': Dispatching value to all write plugins failed with status
> -1.
> Feb 24 01:55:10 controller3 kernel: [1398917.517473] eth0: no IPv6 routers
> present
> Feb 24 01:55:56 controller3 kernel: [1398962.889915] bnx2 0000:01:00.0: irq
> 73 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889924] bnx2 0000:01:00.0: irq
> 74 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889932] bnx2 0000:01:00.0: irq
> 75 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889939] bnx2 0000:01:00.0: irq
> 76 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889946] bnx2 0000:01:00.0: irq
> 77 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889953] bnx2 0000:01:00.0: irq
> 78 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889960] bnx2 0000:01:00.0: irq
> 79 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889968] bnx2 0000:01:00.0: irq
> 80 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.889975] bnx2 0000:01:00.0: irq
> 81 for MSI/MSI-X
> Feb 24 01:55:56 controller3 kernel: [1398962.970592] bnx2 0000:01:00.0:
> eth0: using MSIX
> Feb 24 01:55:56 controller3 kernel: [1398962.971959] ADDRCONF(NETDEV_UP):
> eth0: link is not ready
> Feb 24 01:55:59 controller3 kernel: [1398966.028629] bnx2 0000:01:00.0:
> eth0: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow
> control ON
> Feb 24 01:55:59 controller3 kernel: [1398966.030132]
> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> Feb 24 01:56:05 controller3 ntpdate[10968]: step time server 72.3.128.241
> offset 0.002667 sec
> -------------------------------
>
> * This leads existing ZK client sessions to expire, and those clients
> attempt to make new sessions.
> * One client (on a different node) establishes a new session to this server
> at 01:55:56 (coinciding exactly with the second set of IRQ messages shown
> above):
>
> -------------------------------
> 2012-02-24 01:55:56,964:17928(0x7fbf0b077700):ZOO_INFO@check_events@1632:
> session establishment complete on server [10.14.237.70:2888],
> sessionId=0xd235ad12815a0000, negotiated timeout=6000
> -------------------------------
>
> * This session is able to successfully create ephemeral znodes, but watches
> never fire for the session.  For example, when the session has a children
> watch set on /election/a, and then creates /election/a/a_00000001, its watch
> on /election/a never fires (but it does fire for sessions coming from other
> clients).
>
> Unfortunately I cannot reproduce this, and we don't have good logs.  Because
> of this, I cannot definitively place the blame on Zookeeper -- there are
> layers of our product above ZK that could theoretically be the problem.  But
> given the evidence I've looked at, ZK seems to be the likely culprit.
>
> So my question is: is anyone aware of a failure scenario that would cause
> watches to not be fired, but allow other operations from the same session to
> be successful?  And are there any workarounds?  I don't expect much help on
> this from the community, given the lack of logs and evidence, but I just
> wanted to throw it out there and see if anyone had any ideas off the top of
> their head.  Thanks,
>
> Jeremy
>
