helix-user mailing list archives

From Varun Sharma <va...@pinterest.com>
Subject Re: RoutingTableProvider dropping callbacks
Date Sat, 07 Mar 2015 23:56:53 GMT
Here is the stack trace. There appears to be a ZooKeeper race, and the
detailed stack trace appears for bucketized resources. The ideal state for
the resource was created on 26th Feb and modified on 7th March. However, the
external view for the resource shows as both created and modified on 7th
March. It was created at 10:36:04 on 7th March, about 20 seconds after the
log messages and stack trace below were written. After this, the routing
table provider no longer receives any ZK callbacks.

2015-03-07 10:35:43,735 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN  org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@3c8589f0, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$visual_seo_joins_staging$1422384697040

2015-03-07 10:35:43,736 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN  org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@63230a9a, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$recommendation_p2p_exp_candset_1$1425671237739

2015-03-07 10:35:43,736 [main-EventThread] (ZkAsyncCallbacks.java:127) WARN  org.apache.helix.manager.zk.ZkAsyncCallbacks$SetDataCallbackHandler@118d374f, rc:NONODE, path: /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250

2015-03-07 10:35:43,736 [ZkClient-EventThread-17-terrapinzk001a:2181] (CallbackHandler.java:304) WARN  fail to subscribe child/data change. path: /main_a/EXTERNALVIEW, listener: com.pinterest.terrapin.controller.TerrapinRoutingTableProvider@2c6691da

*org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250*
        at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
        at org.apache.helix.manager.zk.ZkClient.getChildren(ZkClient.java:210)
        at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
        at org.apache.helix.manager.zk.CallbackHandler.subscribeForChanges(CallbackHandler.java:279)
        at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:202)
        at org.apache.helix.manager.zk.CallbackHandler.handleChildChange(CallbackHandler.java:391)
        at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:570)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /main_a/EXTERNALVIEW/$terrapin$data$None$1422308641250
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1249)

2015-03-07 10:35:43,848 [ZkClient-EventThread-17-terrapinzk001a:2181] (RoutingTableProvider.java:99) INFO  *Resetting* the routing table.
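
One plausible reading of the trace, not a confirmed diagnosis, is that a child
znode disappeared between listing the parent and subscribing to that child. The
following is a minimal sketch of that situation, assuming nothing about Helix
internals: ResubscribeSketch, FakeStore, subscribeStrict and subscribeTolerant
are hypothetical names, and the in-memory store only imitates a node vanishing
mid-subscription. It contrasts a loop that gives up on the first missing child
with one that skips the missing child and keeps going.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names are Helix or ZkClient APIs.
class ResubscribeSketch {

    // In-memory stand-in for a ZooKeeper client.
    static class FakeStore {
        private final Set<String> nodes = new LinkedHashSet<>();
        private final String vanishOnWatch; // child deleted just before it is watched

        FakeStore(List<String> children, String vanishOnWatch) {
            nodes.addAll(children);
            this.vanishOnWatch = vanishOnWatch;
        }

        // Returns a snapshot, so later deletions do not affect the caller's loop.
        List<String> listChildren() {
            return new ArrayList<>(nodes);
        }

        // Throws if the child no longer exists, imitating a NoNode error.
        void watch(String child) {
            if (child.equals(vanishOnWatch)) {
                nodes.remove(child); // imitate the race: deleted after listing
            }
            if (!nodes.contains(child)) {
                throw new IllegalStateException("NoNode for " + child);
            }
        }
    }

    // Strict: the first missing child aborts the loop, so later children are never watched.
    static List<String> subscribeStrict(FakeStore store) {
        List<String> watched = new ArrayList<>();
        try {
            for (String child : store.listChildren()) {
                store.watch(child);
                watched.add(child);
            }
        } catch (IllegalStateException e) {
            // Logged and dropped; nothing retries the remaining children.
        }
        return watched;
    }

    // Tolerant: a child that vanished is skipped and the loop continues.
    static List<String> subscribeTolerant(FakeStore store) {
        List<String> watched = new ArrayList<>();
        for (String child : store.listChildren()) {
            try {
                store.watch(child);
                watched.add(child);
            } catch (IllegalStateException e) {
                // Child was deleted after listing; skip it.
            }
        }
        return watched;
    }

    public static void main(String[] args) {
        List<String> children = List.of("a", "b", "c");
        System.out.println(subscribeStrict(new FakeStore(children, "b")));
        System.out.println(subscribeTolerant(new FakeStore(children, "b")));
    }
}
```

With child "b" vanishing, the strict variant ends up watching only "a", while
the tolerant variant watches "a" and "c". Whether the real handler behaves like
either variant is exactly what would need to be confirmed.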

On Thu, Mar 5, 2015 at 11:33 AM, Varun Sharma <varun@pinterest.com> wrote:

> I suspect the callbacks have not been coming in for a long time now.
>
> On Thu, Mar 5, 2015 at 11:30 AM, Varun Sharma <varun@pinterest.com> wrote:
>
>> I grepped this and found nothing:
>>
>> sudo grep START:INVOKE.*EXTERNALVIEW /var/log/terrapin/controller.log*
>>
>> I found a bunch of START:INVOKE for the IDEALSTATES znode though.
>>
>> On Thu, Mar 5, 2015 at 11:15 AM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>
>>> Yes, you should see a pair of "START:INVOKE..." and "END:INVOKE:..."
>>> for each callback in your log.
>>> ------------------------------
>>> *From:* Varun Sharma [varun@pinterest.com]
>>> *Sent:* Thursday, March 05, 2015 11:11 AM
>>> *To:* user@helix.apache.org
>>> *Subject:* Re: RoutingTableProvider dropping callbacks
>>>
>>> OK, is there a way to confirm (from the logs, etc.) that the callbacks
>>> are being processed?
>>>
>>> On Thu, Mar 5, 2015 at 10:50 AM, Zhen Zhang <zzhang@linkedin.com> wrote:
>>>
>>>>  Hi Varun,
>>>>
>>>>  This should not be a problem. When we register a callback, we
>>>> expect a callback of type INIT first, followed by a sequence of
>>>> CALLBACK types, and when you unregister the callback, you will receive a
>>>> FINALIZED type. Since unregistering is an async operation, when you
>>>> receive a FINALIZED type, you might still see a couple of CALLBACK type
>>>> callbacks, which are simply ignored. The log is basically telling you that.
>>>>
>>>>  Thanks,
>>>> Jason
>>>>  ------------------------------
>>>> *From:* Varun Sharma [varun@pinterest.com]
>>>> *Sent:* Thursday, March 05, 2015 10:44 AM
>>>> *To:* user@helix.apache.org
>>>> *Subject:* RoutingTableProvider dropping callbacks
>>>>
>>>>    Hi,
>>>>
>>>>  It seems that the RoutingTableProvider is dropping callbacks in our
>>>> case. Here is a log:
>>>>
>>>>  [ZkClient-EventThread-17-terrapinzk001a:2181]
>>>> (CallbackHandler.java:130) WARN  Skip processing callbacks for listener:
>>>> com.pinterest.terrapin.controller.TerrapinRoutingTableProvider@7e7f8062,
>>>> path: /main_a/EXTERNALVIEW, expected types: [INIT] but was CALLBACK
>>>>
>>>>
>>>>  We have a custom RoutingTableProvider to catch callbacks and do some
>>>> processing - this is causing a lot of issues for us. What could be
>>>> causing this?
>>>>
>>>>  Thanks
>>>> Varun
>>>>
>>>
>>>
>>
>
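
The callback lifecycle described in Zhen's quoted reply (INIT first, then
CALLBACK events, then FINALIZED on unregistering) can be pictured as a small
gate that tracks which types it currently expects. This is a hedged sketch, not
Helix code: CallbackGateSketch and accept are hypothetical names, the type names
are taken from the thread as written, and the transition rule after FINALIZED
(only a new INIT is accepted) is an assumption inferred from the
"expected types: [INIT] but was CALLBACK" log line.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the callback lifecycle described in the thread; not Helix code.
class CallbackGateSketch {
    enum Type { INIT, CALLBACK, FINALIZED }

    // A freshly registered listener expects INIT first.
    private Set<Type> expected = EnumSet.of(Type.INIT);

    // Returns true if the event is processed, false if it is skipped.
    boolean accept(Type type) {
        if (!expected.contains(type)) {
            System.out.println("Skip processing: expected types: " + expected + " but was " + type);
            return false;
        }
        // Assumption: after FINALIZED only a new INIT is valid;
        // after INIT or CALLBACK, either CALLBACK or FINALIZED may follow.
        expected = (type == Type.FINALIZED)
                ? EnumSet.of(Type.INIT)
                : EnumSet.of(Type.CALLBACK, Type.FINALIZED);
        return true;
    }

    public static void main(String[] args) {
        CallbackGateSketch gate = new CallbackGateSketch();
        gate.accept(Type.INIT);
        gate.accept(Type.CALLBACK);
        gate.accept(Type.FINALIZED);
        gate.accept(Type.CALLBACK); // late event after unregistering: skipped
    }
}
```

In this model the skipped-callback warning is harmless only when the listener
was deliberately unregistered. A listener that stays in the "expects INIT"
state and never receives a fresh INIT would skip every later CALLBACK, which is
the situation worth ruling out here.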
