helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Messages building up in helix
Date Mon, 28 Nov 2016 20:52:18 GMT
It looks like nodes add and remove themselves quite often. After you disable
an instance, Helix sends it messages to go from ONLINE to OFFLINE. It looks
like the nodes shut down before they get those messages, and when they come
back up they use a different instance id.

There are two solutions:
- During shutdown: after disabling the instance, wait for the state change
to be reflected in the External View.
- During startup: if possible, re-join the cluster with the same instance
name. If you do that, Helix will remove the old messages.
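
A rough sketch of both ideas, in plain Java with no Helix classes. Everything
here is an assumption for illustration: the external view is modeled as a map
of partition -> (instance -> state), the supplier stands in for whatever call
you use to read the External View, and the class name, method names and the
host_port naming scheme are hypothetical.

```java
import java.util.Map;
import java.util.function.Supplier;

// Illustrative only: not part of Helix.
final class GracefulShutdownSketch {

    /** True when no partition maps the instance to a state other than OFFLINE. */
    static boolean isFullyOffline(Map<String, Map<String, String>> externalView,
                                  String instance) {
        for (Map<String, String> stateByInstance : externalView.values()) {
            String state = stateByInstance.get(instance);
            if (state != null && !"OFFLINE".equals(state)) {
                return false;
            }
        }
        return true;
    }

    /** Polls the view until the instance is fully offline or the timeout elapses. */
    static boolean awaitOffline(Supplier<Map<String, Map<String, String>>> viewSupplier,
                                String instance, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (isFullyOffline(viewSupplier.get(), instance)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    /** Builds the same instance name on every restart (assumed host_port scheme). */
    static String stableInstanceName(String host, int port) {
        return host + "_" + port;
    }
}
```

On shutdown you would disable the instance, call awaitOffline with a supplier
that re-reads the External View, and only then disconnect. On startup you
would connect using stableInstanceName instead of a freshly generated id.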

A third option is to support autoCleanUp in Helix. The Helix controller could
monitor the cluster for dead nodes and remove them automatically after some
time.
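
To make the idea concrete, here is a minimal sketch of the bookkeeping such a
cleanup would need. It is not an existing Helix feature or API: the class
name, the grace period and the observe method are all hypothetical, and the
caller is assumed to supply the configured and live instance sets and to do
the actual removal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: tracks how long each configured instance has been dead.
final class DeadInstanceTracker {
    private final long graceMs;
    private final Map<String, Long> deadSince = new HashMap<>();

    DeadInstanceTracker(long graceMs) {
        this.graceMs = graceMs;
    }

    /**
     * Records one observation of the cluster and returns, in sorted order,
     * the configured instances that have been absent from the live set for
     * at least graceMs.
     */
    List<String> observe(Set<String> configured, Set<String> live, long nowMs) {
        // Forget instances that are no longer configured at all.
        deadSince.keySet().retainAll(configured);
        List<String> expired = new ArrayList<>();
        for (String instance : new TreeSet<>(configured)) {
            if (live.contains(instance)) {
                deadSince.remove(instance); // came back: reset its timer
                continue;
            }
            long since = deadSince.computeIfAbsent(instance, k -> nowMs);
            if (nowMs - since >= graceMs) {
                expired.add(instance);
            }
        }
        return expired;
    }
}
```

A periodic task could call observe and then disable and drop whatever it
returns. The grace period keeps a node that is merely restarting from being
removed.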



On Mon, Nov 28, 2016 at 12:39 PM, Sesh Jalagam <sjalagam@box.com> wrote:

> <clustername>/INSTANCES/INSTANCES/MESSAGES contains messages that have
> already been read.
>
> Here is an example.
>     ,"FROM_STATE":"ONLINE"
>     ,"MSG_STATE":"read"
>     ,"MSG_TYPE":"STATE_TRANSITION"
>     ,"STATE_MODEL_DEF":"OnlineOffline"
>     ,"STATE_MODEL_FACTORY_NAME":"DEFAULT"
>     ,"TO_STATE":"OFFLINE"
>
> I see these messages after the participant is disabled and dropped, i.e.
> after <clustername>/INSTANCES/<PARTICIPANT_ID> is removed.
>
> Thanks
>
>
> On Mon, Nov 28, 2016 at 12:18 PM, kishore g <g.kishore@gmail.com> wrote:
>
>> By <clustername>/INSTANCES/INSTANCES/MESSAGES, do you mean
>> <clustername>/INSTANCES/<PARTICIPANT_ID>/MESSAGES?
>>
>> What kind of messages do you see under these nodes?
>>
>>
>>
>> On Mon, Nov 28, 2016 at 12:04 PM, Sesh Jalagam <sjalagam@box.com> wrote:
>>
>>> Our setup is as follows.
>>>
>>> - Controller (leader elected from one of the cluster nodes)
>>>
>>> - Cluster of nodes as participants in OnlineOffline StateModel
>>>
>>> - Set of resources with partitions.
>>>
>>>
>>> On startup, each node creates a controller, adds a participant if it does
>>> not already exist, and waits for the callbacks to handle partition rebalancing.
>>>
>>> Please note this cluster is created on the fly multiple times a day
>>> (actual cluster is not deleted, but new participants are removed and
>>> re-added)
>>>
>>>
>>> Everything works fine in production, but I see that the number of znodes
>>> in <clustername>/INSTANCES/INSTANCES/MESSAGES is growing.
>>>
>>> What is <cluster_id>/INSTANCES/INSTANCES used for? Is there a way for
>>> the messages to be deleted automatically?
>>>
>>> I see a similar buildup in <cluster_id>/INSTANCES/INSTANCES/CURRENTSTATES.
>>>
>>>
>>> Thanks
>>> --
>>> - Sesh .J
>>>
>>
>>
>
>
> --
> - Sesh .J
>
