helix-user mailing list archives

From Varun Sharma <va...@pinterest.com>
Subject Re: Excessive ZooKeeper load
Date Fri, 06 Feb 2015 04:48:33 GMT
I see - one more thing - there was talk of a batching mode where Helix can
batch updates - can it batch multiple updates to the external view and write
once into ZooKeeper instead of writing for every update? For example, consider
the case when lots of partitions are being onlined - could updates to the
external view be batched into groups of 100? Is that supported in Helix 0.6.4?
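
Conceptually, I mean something like the sketch below - the class and method
names are hypothetical, not a real Helix API, just to illustrate accumulating
updates and writing once per batch rather than once per update:

// Hypothetical sketch - not a real Helix API. Accumulates external view
// updates and issues one ZK write per batch of 100 instead of one per update.
import java.util.ArrayList;
import java.util.List;

class ExternalViewBatcher {
    private static final int BATCH_SIZE = 100;
    private final List<String> pendingUpdates = new ArrayList<>();

    // Called once per partition state change.
    synchronized void onPartitionUpdate(String update) {
        pendingUpdates.add(update);
        if (pendingUpdates.size() >= BATCH_SIZE) {
            flush();
        }
    }

    // One write on the external view znode for the whole batch.
    private synchronized void flush() {
        writeExternalViewToZk(new ArrayList<>(pendingUpdates));
        pendingUpdates.clear();
    }

    private void writeExternalViewToZk(List<String> batch) {
        // apply all updates to the cached view, then write it to ZK once
    }
}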

Thanks!
Varun

On Thu, Feb 5, 2015 at 5:23 PM, Zhen Zhang <zzhang@linkedin.com> wrote:

>  Yes, the listener will be notified on add/delete/modify. You can
> distinguish them if you keep a local cache and compare against it to get the
> delta. Currently the API doesn't expose this.
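>
> For example, a minimal sketch of the cache-and-diff idea in plain Java
> (the map-of-string shape is an assumption for illustration, not the actual
> Helix record types):
>
> // Sketch: keep the last-seen external views locally and diff against the
> // new snapshot to classify each resource as added, deleted, or modified.
> import java.util.HashSet;
> import java.util.Map;
> import java.util.Set;
>
> class ExternalViewDiff {
>     static void diff(Map<String, String> cached, Map<String, String> latest) {
>         Set<String> added = new HashSet<>(latest.keySet());
>         added.removeAll(cached.keySet());          // new resources
>
>         Set<String> deleted = new HashSet<>(cached.keySet());
>         deleted.removeAll(latest.keySet());        // dropped resources
>
>         Set<String> modified = new HashSet<>();
>         for (Map.Entry<String, String> e : latest.entrySet()) {
>             String old = cached.get(e.getKey());
>             if (old != null && !old.equals(e.getValue())) {
>                 modified.add(e.getKey());          // changed content
>             }
>         }
>         // act on added/deleted/modified, then replace the cache with latest
>     }
> }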
>
>  ------------------------------
> *From:* Varun Sharma [varun@pinterest.com]
> *Sent:* Thursday, February 05, 2015 1:53 PM
>
> *To:* user@helix.apache.org
> *Subject:* Re: Excessive ZooKeeper load
>
>   I assume that it also gets called when external views get modified?
> How can I distinguish whether there was an add, a modify, or a delete?
>
>  Thanks
> Varun
>
> On Thu, Feb 5, 2015 at 9:27 AM, Zhen Zhang <zzhang@linkedin.com> wrote:
>
>>  Yes. It will get invoked when external views are added or deleted.
>>  ------------------------------
>> *From:* Varun Sharma [varun@pinterest.com]
>> *Sent:* Thursday, February 05, 2015 1:27 AM
>>
>> *To:* user@helix.apache.org
>> *Subject:* Re: Excessive ZooKeeper load
>>
>>    I had another question - does the RoutingTableProvider
>> onExternalViewChange call get invoked when a resource gets deleted (and
>> hence its external view znode)?
>>
>> On Wed, Feb 4, 2015 at 10:54 PM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>
>>>  Yes. I think we did this in the incubating stage or even before. It's
>>> probably in a separate branch for some performance evaluation.
>>>
>>>  ------------------------------
>>> *From:* kishore g [g.kishore@gmail.com]
>>> *Sent:* Wednesday, February 04, 2015 9:54 PM
>>>
>>> *To:* user@helix.apache.org
>>> *Subject:* Re: Excessive ZooKeeper load
>>>
>>>    Jason, I remember we had the ability to compress/decompress, and
>>> before we added support for bucketizing, compression was used to support a
>>> large number of partitions. However, I don't see the code anywhere. Did we
>>> do this on a separate branch?
>>>
>>>  thanks,
>>> Kishore G
>>>
>>> On Wed, Feb 4, 2015 at 3:30 PM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>>
>>>>  Hi Varun, we can certainly add compression and have a config for
>>>> turning it on/off. We have implemented compression in our own zkclient
>>>> before. The issues with compression might be:
>>>> 1) CPU consumption on the controller will increase.
>>>> 2) It is harder to debug.
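>>>>
>>>>  As a minimal sketch of how it could be plugged in - a gzip wrapper
>>>> around zkclient's ZkSerializer, delegating to whatever serializer is in
>>>> use today (an illustration, not the Helix implementation):
>>>>
>>>> // Sketch: wrap an existing ZkSerializer with gzip so znode payloads
>>>> // are compressed on write and decompressed on read.
>>>> import java.io.ByteArrayInputStream;
>>>> import java.io.ByteArrayOutputStream;
>>>> import java.io.IOException;
>>>> import java.util.zip.GZIPInputStream;
>>>> import java.util.zip.GZIPOutputStream;
>>>> import org.I0Itec.zkclient.exception.ZkMarshallingError;
>>>> import org.I0Itec.zkclient.serialize.ZkSerializer;
>>>>
>>>> class GzipZkSerializer implements ZkSerializer {
>>>>     private final ZkSerializer inner;
>>>>
>>>>     GzipZkSerializer(ZkSerializer inner) { this.inner = inner; }
>>>>
>>>>     @Override
>>>>     public byte[] serialize(Object data) throws ZkMarshallingError {
>>>>         try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
>>>>              GZIPOutputStream gz = new GZIPOutputStream(bos)) {
>>>>             gz.write(inner.serialize(data));
>>>>             gz.finish();              // flush compressed data + trailer
>>>>             return bos.toByteArray();
>>>>         } catch (IOException e) {
>>>>             throw new ZkMarshallingError(e);
>>>>         }
>>>>     }
>>>>
>>>>     @Override
>>>>     public Object deserialize(byte[] bytes) throws ZkMarshallingError {
>>>>         try (GZIPInputStream gz =
>>>>                  new GZIPInputStream(new ByteArrayInputStream(bytes));
>>>>              ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
>>>>             byte[] buf = new byte[4096];
>>>>             int n;
>>>>             while ((n = gz.read(buf)) != -1) bos.write(buf, 0, n);
>>>>             return inner.deserialize(bos.toByteArray());
>>>>         } catch (IOException e) {
>>>>             throw new ZkMarshallingError(e);
>>>>         }
>>>>     }
>>>> }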
>>>>
>>>>  Thanks,
>>>> Jason
>>>>  ------------------------------
>>>> *From:* kishore g [g.kishore@gmail.com]
>>>> *Sent:* Wednesday, February 04, 2015 3:08 PM
>>>>
>>>> *To:* user@helix.apache.org
>>>> *Subject:* Re: Excessive ZooKeeper load
>>>>
>>>>    We do have the ability to compress the data. I am not sure if there
>>>> is an easy way to turn the compression on/off.
>>>>
>>>> On Wed, Feb 4, 2015 at 2:49 PM, Varun Sharma <varun@pinterest.com>
>>>> wrote:
>>>>
>>>>> I am wondering if it's possible to gzip the external view znode - a
>>>>> simple gzip cut down the data size by 25X. Is it possible to plug in
>>>>> compression/decompression as zookeeper nodes are read?
>>>>>
>>>>>  Varun
>>>>>
>>>>> On Mon, Feb 2, 2015 at 8:53 PM, kishore g <g.kishore@gmail.com> wrote:
>>>>>
>>>>>> There are multiple options we can try here.
>>>>>> What if we used a cached data accessor for this use case? Clients will
>>>>>> only read if the node has changed. This optimization can benefit all
>>>>>> use cases.
>>>>>>
>>>>>> What about batching the watch triggers? Not sure which version of
>>>>>> Helix has this option.
>>>>>>
>>>>>> Another option is to use a poll-based routing table instead of a
>>>>>> watch-based one. This, coupled with a cached data accessor, can be very
>>>>>> efficient - see the sketch below.
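>>>>>>
>>>>>> Here is a sketch of the read-only-if-changed idea against the raw
>>>>>> ZooKeeper client API (an illustration of the pattern, not the Helix
>>>>>> cached accessor itself):
>>>>>>
>>>>>> // Sketch: poll a znode but fetch its data only when the version from
>>>>>> // exists() has moved, skipping the expensive read when nothing changed.
>>>>>> import org.apache.zookeeper.KeeperException;
>>>>>> import org.apache.zookeeper.ZooKeeper;
>>>>>> import org.apache.zookeeper.data.Stat;
>>>>>>
>>>>>> class PollingReader {
>>>>>>     private final ZooKeeper zk;
>>>>>>     private int lastVersion = -1;
>>>>>>
>>>>>>     PollingReader(ZooKeeper zk) { this.zk = zk; }
>>>>>>
>>>>>>     byte[] readIfChanged(String path)
>>>>>>             throws KeeperException, InterruptedException {
>>>>>>         Stat stat = zk.exists(path, false); // no watch, metadata only
>>>>>>         if (stat == null || stat.getVersion() == lastVersion) {
>>>>>>             return null;                    // unchanged, skip big read
>>>>>>         }
>>>>>>         byte[] data = zk.getData(path, false, stat);
>>>>>>         lastVersion = stat.getVersion();
>>>>>>         return data;
>>>>>>     }
>>>>>> }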
>>>>>>
>>>>>> Thanks,
>>>>>> Kishore G
>>>>>>  On Feb 2, 2015 8:17 PM, "Varun Sharma" <varun@pinterest.com> wrote:
>>>>>>
>>>>>>> My total external view across all resources is roughly 3M in size,
>>>>>>> and there are 100 clients downloading it twice for every node restart -
>>>>>>> that's 600M of data for every restart. So I guess that is causing this
>>>>>>> issue. We are thinking of doing some tricks to limit the # of clients
>>>>>>> from 100 to 1. I guess that should help significantly.
>>>>>>>
>>>>>>>  Varun
>>>>>>>
>>>>>>> On Mon, Feb 2, 2015 at 7:37 PM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>>>>>>
>>>>>>>>  Hey Varun,
>>>>>>>>
>>>>>>>>  I guess your external view is pretty large, since each external
>>>>>>>> view callback takes ~3s. The RoutingTableProvider is callback based,
>>>>>>>> so only when there is a change in the external view will the
>>>>>>>> RoutingTableProvider read the entire external view from ZK. During the
>>>>>>>> rolling upgrade, there are lots of live-instance changes, which may
>>>>>>>> lead to a lot of changes in the external view. One possible way to
>>>>>>>> mitigate the issue is to smooth the traffic by adding some delay
>>>>>>>> between bouncing nodes. We can do a rough estimation of how many
>>>>>>>> external view changes you might have during the upgrade, how many
>>>>>>>> listeners you have, and how large the external views are. Once we have
>>>>>>>> these numbers, we will know the ZK bandwidth requirement. ZK read
>>>>>>>> bandwidth can be scaled by adding ZK observers.
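>>>>>>>>
>>>>>>>>  As a rough worked example with the numbers from this thread: a ~3M
>>>>>>>> aggregate external view read by 100 listeners is ~300M of ZK reads
>>>>>>>> per full round of callbacks, so even a handful of external view
>>>>>>>> changes during an upgrade moves multiple gigabytes out of ZK.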
>>>>>>>>
>>>>>>>>  A ZK watcher is one-time only, so every time a listener receives a
>>>>>>>> callback, it will re-register its watcher with ZK.
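>>>>>>>>
>>>>>>>>  For example, a minimal sketch of the one-shot-watch pattern against
>>>>>>>> the raw ZooKeeper API (an illustration, not Helix's CallbackHandler):
>>>>>>>>
>>>>>>>> // Sketch: ZK watches fire once, so the handler re-arms the watch by
>>>>>>>> // issuing the read again after every callback.
>>>>>>>> import org.apache.zookeeper.KeeperException;
>>>>>>>> import org.apache.zookeeper.WatchedEvent;
>>>>>>>> import org.apache.zookeeper.Watcher;
>>>>>>>> import org.apache.zookeeper.ZooKeeper;
>>>>>>>>
>>>>>>>> class RearmingWatcher implements Watcher {
>>>>>>>>     private final ZooKeeper zk;
>>>>>>>>     private final String path;
>>>>>>>>
>>>>>>>>     RearmingWatcher(ZooKeeper zk, String path) {
>>>>>>>>         this.zk = zk;
>>>>>>>>         this.path = path;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     void arm() throws KeeperException, InterruptedException {
>>>>>>>>         zk.getData(path, this, null); // sets a one-shot watch
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     public void process(WatchedEvent event) {
>>>>>>>>         try {
>>>>>>>>             arm(); // must re-register after every event
>>>>>>>>         } catch (KeeperException | InterruptedException e) {
>>>>>>>>             // in practice: log, and retry on connection loss
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> }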
>>>>>>>>
>>>>>>>>  It's normally unreliable to depend on delta changes instead of
>>>>>>>> reading the entire znode. There might be some corner cases where you
>>>>>>>> would lose delta changes if you depend on that.
>>>>>>>>
>>>>>>>>  For the ZK connection issue, do you have any log on the ZK server
>>>>>>>> side regarding this connection?
>>>>>>>>
>>>>>>>>  Thanks,
>>>>>>>> Jason
>>>>>>>>
>>>>>>>>   ------------------------------
>>>>>>>> *From:* Varun Sharma [varun@pinterest.com]
>>>>>>>> *Sent:* Monday, February 02, 2015 4:41 PM
>>>>>>>> *To:* user@helix.apache.org
>>>>>>>> *Subject:* Re: Excessive ZooKeeper load
>>>>>>>>
>>>>>>>>    I believe there is a misbehaving client. Here is a stack trace -
>>>>>>>> it probably lost its connection and is now stampeding ZK:
>>>>>>>>
>>>>>>>>  "ZkClient-EventThread-104-terrapinzk001a:2181,terrapinzk
>>>>>>>> 002b:2181,terrapinzk003e:2181" daemon prio=10
>>>>>>>> tid=0x00007f534144b800 nid=0x7db5 in Object.wait() [0x00007f52ca9c3000]
>>>>>>>>
>>>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>
>>>>>>>>         at java.lang.Object.wait(Native Method)
>>>>>>>>
>>>>>>>>         at java.lang.Object.wait(Object.java:503)
>>>>>>>>
>>>>>>>>         at
>>>>>>>> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
>>>>>>>>
>>>>>>>>         - locked <0x00000004fb0d8c38> (a
>>>>>>>> org.apache.zookeeper.ClientCnxn$Packet)
>>>>>>>>
>>>>>>>>         at
>>>>>>>> org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1036)
>>>>>>>>
>>>>>>>>         at
>>>>>>>> org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
>>>>>>>>
>>>>>>>>         at org.I0Itec.zk
>>>>>>>> client.ZkConnection.exists(ZkConnection.java:95)
>>>>>>>>
>>>>>>>>         at org.I0Itec.zkclient.ZkClient$11.call(ZkClient.java:823)
>>>>>>>>
>>>>>>>> *        at
>>>>>>>> org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)*
>>>>>>>>
>>>>>>>> *        at
>>>>>>>> org.I0Itec.zkclient.ZkClient.watchForData(ZkClient.java:820)*
>>>>>>>>
>>>>>>>> *        at
>>>>>>>> org.I0Itec.zkclient.ZkClient.subscribeDataChanges(ZkClient.java:136)*
>>>>>>>>
>>>>>>>>         at org.apache.helix.manager.zk
>>>>>>>> .CallbackHandler.subscribeDataChange(CallbackHandler.java:241)
>>>>>>>>
>>>>>>>>         at org.apache.helix.manager.zk
>>>>>>>> .CallbackHandler.subscribeForChanges(CallbackHandler.java:287)
>>>>>>>>
>>>>>>>>         at org.apache.helix.manager.zk
>>>>>>>> .CallbackHandler.invoke(CallbackHandler.java:202)
>>>>>>>>
>>>>>>>>         - locked <0x000000056b75a948> (a org.apache.helix.manager.
>>>>>>>> zk.ZKHelixManager)
>>>>>>>>
>>>>>>>>         at org.apache.helix.manager.zk
>>>>>>>> .CallbackHandler.handleDataChange(CallbackHandler.java:338)
>>>>>>>>
>>>>>>>>         at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:547)
>>>>>>>>
>>>>>>>>         at org.I0Itec.zk
>>>>>>>> client.ZkEventThread.run(ZkEventThread.java:71)
>>>>>>>>
>>>>>>>> On Mon, Feb 2, 2015 at 4:28 PM, Varun Sharma <varun@pinterest.com> wrote:
>>>>>>>>
>>>>>>>>> I am wondering what is causing the ZK subscription to happen every
>>>>>>>>> 2-3 seconds - is a new watch being established every 3 seconds?
>>>>>>>>>
>>>>>>>>>  Thanks
>>>>>>>>>  Varun
>>>>>>>>>
>>>>>>>>> On Mon, Feb 2, 2015 at 4:23 PM, Varun Sharma <varun@pinterest.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>  We are serving a few different resources whose total # of
>>>>>>>>>> partitions is ~30K. We just did a rolling restart of the cluster,
>>>>>>>>>> and the clients which use the RoutingTableProvider are stuck in a
>>>>>>>>>> bad state where they are constantly subscribing to changes in the
>>>>>>>>>> external view of the cluster. Here is the Helix log on the client
>>>>>>>>>> after our rolling restart finished - the client is constantly
>>>>>>>>>> polling ZK. The zookeeper node is pushing 300mbps right now, and
>>>>>>>>>> most of the traffic is being pulled by clients. Is this a race
>>>>>>>>>> condition? Also, is there an easy way to make the clients not poll
>>>>>>>>>> so aggressively? We restarted one of the clients and we don't see
>>>>>>>>>> these same messages anymore. Also, is it possible to just propagate
>>>>>>>>>> external view diffs instead of the whole big znode?
>>>>>>>>>>
>>>>>>>>>>  15/02/03 00:21:18 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3340ms
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:18 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3371ms
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:22 INFO zk.CallbackHandler: pinacle2084 subscribes child-change. path: /main_a/EXTERNALVIEW, listener: org.apache.helix.spectator.RoutingTableProvider@76984879
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 END:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider Took: 3281ms
>>>>>>>>>>
>>>>>>>>>> 15/02/03 00:21:25 INFO zk.CallbackHandler: 104 START:INVOKE /main_a/EXTERNALVIEW listener:org.apache.helix.spectator.RoutingTableProvider
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
