helix-user mailing list archives

From Zhen Zhang <zzh...@linkedin.com>
Subject RE: Excessive ZooKeeper load
Date Fri, 06 Feb 2015 01:23:23 GMT
Yes, the listener will be notified on add/delete/modify. You can distinguish them if you keep a local cache and compare against it to get the delta. Currently the API doesn't expose this.
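The local-cache diff described above could be sketched as follows. This is an illustrative stand-in, not a Helix API: it keeps the last external view seen per resource and compares it against the freshly read set to classify each change as an add, modify, or delete.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: classify external-view changes by diffing a local cache
// against the freshly read set. Names and types are illustrative; a real
// implementation would key on resource name and hold the parsed ExternalView.
public class ExternalViewDiff {
    // Returns change records such as "ADD resourceA" (order not significant).
    static List<String> diff(Map<String, String> cached, Map<String, String> fresh) {
        List<String> changes = new ArrayList<>();
        for (Map.Entry<String, String> e : fresh.entrySet()) {
            if (!cached.containsKey(e.getKey())) {
                changes.add("ADD " + e.getKey());
            } else if (!Objects.equals(cached.get(e.getKey()), e.getValue())) {
                changes.add("MODIFY " + e.getKey());
            }
        }
        for (String key : cached.keySet()) {
            if (!fresh.containsKey(key)) {
                changes.add("DELETE " + key);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> cached = new HashMap<>();
        cached.put("resourceA", "v1");
        cached.put("resourceB", "v1");
        Map<String, String> fresh = new HashMap<>();
        fresh.put("resourceA", "v2"); // modified
        fresh.put("resourceC", "v1"); // added; resourceB is gone, i.e. deleted
        System.out.println(diff(cached, fresh));
    }
}
```

After classifying, the caller would overwrite the cache with the fresh map so the next callback diffs against the latest state.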

________________________________
From: Varun Sharma [varun@pinterest.com]
Sent: Thursday, February 05, 2015 1:53 PM
To: user@helix.apache.org
Subject: Re: Excessive ZooKeeper load

I assume that it also gets called when external views get modified? How can I distinguish whether there was an add, a modify, or a delete?

Thanks
Varun

On Thu, Feb 5, 2015 at 9:27 AM, Zhen Zhang <zzhang@linkedin.com>
wrote:
Yes. It will get invoked when external views are added or deleted.
________________________________
From: Varun Sharma [varun@pinterest.com]
Sent: Thursday, February 05, 2015 1:27 AM

To: user@helix.apache.org
Subject: Re: Excessive ZooKeeper load

I had another question - does the RoutingTableProvider onExternalViewChange callback get invoked when a resource gets deleted (and hence its external view znode)?

On Wed, Feb 4, 2015 at 10:54 PM, Zhen Zhang <zzhang@linkedin.com>
wrote:
Yes. I think we did this in the incubating stage or even before. It's probably in a separate
branch for some performance evaluation.

________________________________
From: kishore g [g.kishore@gmail.com]
Sent: Wednesday, February 04, 2015 9:54 PM

To: user@helix.apache.org
Subject: Re: Excessive ZooKeeper load

Jason, I remember having the ability to compress/decompress, and before we added support for bucketizing, compression was used to support a large number of partitions. However, I don't see the code anywhere. Did we do this on a separate branch?

thanks,
Kishore G

On Wed, Feb 4, 2015 at 3:30 PM, Zhen Zhang <zzhang@linkedin.com>
wrote:
Hi Varun, we can certainly add compression and have a config for turning it on/off. We have implemented compression in our own zkclient before. The issues with compression might be:
1) CPU consumption on the controller will increase.
2) It is harder to debug.

Thanks,
Jason
________________________________
From: kishore g [g.kishore@gmail.com]
Sent: Wednesday, February 04, 2015 3:08 PM

To: user@helix.apache.org
Subject: Re: Excessive ZooKeeper load

We do have the ability to compress the data. I am not sure if there is an easy way to turn the compression on/off.

On Wed, Feb 4, 2015 at 2:49 PM, Varun Sharma <varun@pinterest.com>
wrote:
I am wondering if it's possible to gzip the external view znode - a simple gzip cut the data size by 25x. Is it possible to plug in compression/decompression as ZooKeeper nodes are read?
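For reference, a sketch of the gzip round trip such a pluggable serializer would perform, using plain `java.util.zip` (this is not Helix's actual serializer API, just the compress-on-write / decompress-on-read idea). An external view payload is highly repetitive (host:port strings, state names), which is why gzip shrinks it so dramatically.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch: gzip a znode payload before writing it and gunzip it after reading.
public class ZnodeGzip {
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    static byte[] decompress(byte[] packed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake but repetitive external-view-like payload.
        StringBuilder sb = new StringBuilder();
        for (int p = 0; p < 1000; p++) {
            sb.append("\"partition_").append(p).append("\":{\"host1:1234\":\"ONLINE\"},");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println("raw=" + raw.length + " gzip=" + packed.length);
        // The round trip must be lossless.
        System.out.println(java.util.Arrays.equals(raw, decompress(packed))); // true
    }
}
```

The caveats Jason raises downthread still apply: the controller pays CPU on every write, and compressed znodes are opaque to ad-hoc inspection with zkCli.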

Varun

On Mon, Feb 2, 2015 at 8:53 PM, kishore g <g.kishore@gmail.com>
wrote:

There are multiple options we can try here.

What if we used the cached data accessor for this use case? Clients will only read if the node has changed. This optimization can benefit all use cases.

What about batching the watch triggers? I'm not sure which version of Helix has this option.

Another option is to use a poll-based routing table instead of a watch-based one. Coupled with the cached data accessor, this can be very efficient.

Thanks,
Kishore G
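The "read only if the node has changed" idea above can be sketched with a version-gated cache. This is an in-memory stand-in, not the Helix accessor itself: in practice the version would come from a cheap exists()/Stat call against ZK, and the expensive getData() on the big znode is skipped when the version is unchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of a cached data accessor: remember the last
// (version, data) per path and skip the full read when the server-side
// version has not moved.
public class CachedReader {
    static class Entry { int version; String data; }
    private final Map<String, Entry> cache = new HashMap<>();
    int fetches = 0; // counts expensive full reads actually performed

    // 'remoteVersion' stands in for the znode version from a Stat;
    // 'fetch' simulates the expensive getData() on a multi-MB znode.
    String read(String path, int remoteVersion, Supplier<String> fetch) {
        Entry e = cache.get(path);
        if (e != null && e.version == remoteVersion) {
            return e.data; // unchanged: serve from cache, no big read
        }
        Entry fresh = new Entry();
        fresh.version = remoteVersion;
        fresh.data = fetch.get();
        fetches++;
        cache.put(path, fresh);
        return fresh.data;
    }

    public static void main(String[] args) {
        CachedReader r = new CachedReader();
        r.read("/main_a/EXTERNALVIEW/res1", 1, () -> "ev-v1");
        r.read("/main_a/EXTERNALVIEW/res1", 1, () -> "ev-v1"); // cache hit
        String d = r.read("/main_a/EXTERNALVIEW/res1", 2, () -> "ev-v2"); // changed
        System.out.println(r.fetches + " " + d); // 2 ev-v2
    }
}
```

With 100 spectators re-reading a 3 MB view, every avoided full read is bandwidth the ZK ensemble does not have to serve.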

On Feb 2, 2015 8:17 PM, "Varun Sharma" <varun@pinterest.com>
wrote:
My total external view across all resources is roughly 3M in size, and there are 100 clients downloading it twice for every node restart - that's 600M of data per restart. So I guess that is causing this issue. We are thinking of doing some tricks to limit the # of clients from 100 to 1. I guess that should help significantly.

Varun

On Mon, Feb 2, 2015 at 7:37 PM, Zhen Zhang <zzhang@linkedin.com>
wrote:
Hey Varun,

I guess your external view is pretty large, since each external view callback takes ~3s. The RoutingTableProvider is callback based, so only when there is a change in the external view will it read the entire external view from ZK. During the rolling upgrade there are lots of live-instance changes, which may lead to a lot of changes in the external view. One possible way to mitigate the issue is to smooth out the traffic by adding some delay between bouncing nodes. We can do a rough estimate of how many external view changes you might see during the upgrade, how many listeners you have, and how large the external views are. Once we have these numbers, we will know the ZK bandwidth requirement. ZK read bandwidth can be scaled by adding ZK observers.
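Plugging the numbers quoted elsewhere in this thread into that estimate (all values assumed for illustration, not measured):

```java
// Back-of-the-envelope ZK read bandwidth per bounced node:
// total view size x number of spectators x full re-reads per spectator.
public class ZkBandwidthEstimate {
    public static void main(String[] args) {
        double viewMb = 3.0;     // total external view across all resources
        int clients = 100;       // RoutingTableProvider instances
        int readsPerBounce = 2;  // full re-reads each client does per restart
        double totalMb = viewMb * clients * readsPerBounce;
        System.out.println(totalMb); // 600.0 MB pulled from ZK per bounce
    }
}
```

Multiply by the number of nodes bounced during the rolling upgrade to see why the ensemble ends up pushing hundreds of Mbps.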

A ZK watcher is one-time only, so every time a listener receives a callback, it re-registers its watcher with ZK.

It's generally unreliable to depend on delta changes instead of reading the entire znode. There are corner cases where you would lose delta changes if you depended on them.

For the ZK connection issue, do you have any log on the ZK server side regarding this connection?

Thanks,
Jason

________________________________
From: Varun Sharma [varun@pinterest.com]
Sent: Monday, February 02, 2015 4:41 PM
To: user@helix.apache.org
Subject: Re: Excessive ZooKeeper load

I believe there is a misbehaving client. Here is a stack trace - it probably lost its connection and is now stampeding the server:


"ZkClient-EventThread-104-terrapinzk001a:2181,terrapinzk002b:2181,terrapinzk003e:2181" daemon prio=10 tid=0x00007f534144b800 nid=0x7db5 in Object.wait() [0x00007f52ca9c3000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
        - locked <0x00000004fb0d8c38> (a org.apache.zookeeper.ClientCnxn$Packet)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
        at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95)
        at org.I0Itec.zkclient.ZkClient$11.call(ZkClient.java:823)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
        at org.I0Itec.zkclient.ZkClient.watchForData(ZkClient.java:820)
        at org.I0Itec.zkclient.ZkClient.subscribeDataChanges(ZkClient.java:136)
        at org.apache.helix.manager.zk.CallbackHandler.subscribeDataChange(CallbackHandler.java:241)
        at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:287)
        at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
        - locked <0x000000056b75a948> (a org.apache.helix.manager.zk.ZKHelixManager)
        at org.apache.helix.manager.zk.CallbackHandler.handleDataChange(CallbackHandler.java:338)
        at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:547)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)

On Mon, Feb 2, 2015 at 4:28 PM, Varun Sharma <varun@pinterest.com>
wrote:
I am wondering what is causing the ZK subscription to happen every 2-3 seconds - is a new watch being established every 3 seconds?

Thanks
Varun

On Mon, Feb 2, 2015 at 4:23 PM, Varun Sharma <varun@pinterest.com>
wrote:
Hi,

We are serving a few different resources whose total # of partitions is ~30K. We just did a rolling restart of the cluster, and the clients which use the RoutingTableProvider are stuck in a bad state where they are constantly subscribing to changes in the external view of the cluster. Here is the helix log on the client after our rolling restart finished - the client is constantly polling ZK. The ZooKeeper node is pushing 300 Mbps right now, and most of the traffic is being pulled by clients. Is this a race condition? Also, is there an easy way to make the clients not poll so aggressively? We restarted one of the clients and no longer see these messages. Also, is it possible to propagate just external view diffs instead of the whole big znode?

15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
