helix-commits mailing list archives

From "subramanian raghunathan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HELIX-652) Double assignment , when participant is not able to establish connection with zookeeper quorum
Date Thu, 26 Jan 2017 20:04:24 GMT

    [ https://issues.apache.org/jira/browse/HELIX-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840342#comment-15840342 ]

subramanian raghunathan commented on HELIX-652:

Thoughts/Inputs from Kishore:
Helix can handle this and probably should. There are a couple of challenges here:
1.	How to generalize this across all use cases. This is a trade-off between availability and
ensuring there is only one leader per partition.
2.	There is a pathological case where all zookeeper nodes get partitioned, crash, or go into GC.
In this case, we will make all participants disconnect and assume they don't own the partition.
But when the zookeepers come out of GC, they can continue as if nothing happened, i.e., they do
not account for the time they were down. I can't think of a good solution for this scenario.
Moreover, we cannot differentiate between a participant that is GC'ing or partitioned and a
ZK ensemble crash, partition, or GC.
This is typically avoided by ensuring ZK servers are deployed on different racks.
Having said that, I think implementing a config-based solution is worth it.
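
To make the config-based idea concrete, here is a minimal sketch of what a participant-side guard might look like. It is not Helix code: the class DisconnectGuard, the setting maxDisconnectMillis, and all method names are hypothetical, and the caller is assumed to feed in connection events and a monotonic clock value. The participant gives up its partitions once it has been disconnected for longer than the configured limit.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names exist in the Helix API. */
final class DisconnectGuard {
    // Hypothetical config value: how long a participant may stay
    // disconnected before it stops serving its partitions.
    private final long maxDisconnectMillis;
    private final Set<String> ownedPartitions = new TreeSet<>();
    // -1 means "currently connected".
    private long disconnectedSinceMillis = -1;

    DisconnectGuard(long maxDisconnectMillis) {
        this.maxDisconnectMillis = maxDisconnectMillis;
    }

    void acquire(String partition) {
        ownedPartitions.add(partition);
    }

    boolean owns(String partition) {
        return ownedPartitions.contains(partition);
    }

    /** Record the start of a disconnect; repeated calls keep the first timestamp. */
    void onDisconnected(long nowMillis) {
        if (disconnectedSinceMillis < 0) {
            disconnectedSinceMillis = nowMillis;
        }
    }

    void onReconnected() {
        disconnectedSinceMillis = -1;
    }

    /**
     * Called periodically by the participant. Returns the partitions that
     * were relinquished in this call (empty if the limit is not yet reached).
     */
    Set<String> tick(long nowMillis) {
        boolean connected = disconnectedSinceMillis < 0;
        if (connected || nowMillis - disconnectedSinceMillis < maxDisconnectMillis) {
            return new TreeSet<>();
        }
        Set<String> dropped = new TreeSet<>(ownedPartitions);
        ownedPartitions.clear();
        return dropped;
    }
}
```

For this to narrow the double-assignment window, the limit would presumably need to be shorter than the ZooKeeper session timeout, since the ephemeral node (and hence the reassignment) only goes away once the session expires. The sketch does not address challenge 2 above: if the whole ensemble is unreachable, every participant would drop its partitions, which is the availability cost of the trade-off.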

> Double assignment , when participant is not able to establish connection with zookeeper
> ----------------------------------------------------------------------------------------------
>                 Key: HELIX-652
>                 URL: https://issues.apache.org/jira/browse/HELIX-652
>             Project: Apache Helix
>          Issue Type: Bug
>          Components: helix-core
>    Affects Versions: 0.7.1, 0.6.4
>            Reporter: subramanian raghunathan
> Double assignment , when participant is not able to establish connection with zookeeper
> Following is the setup.
> Version(s): Helix: 0.7.1
>             Zookeeper: 3.3.4
> - State Model: OnlineOffline
> - Controller (leader elected from one of the cluster nodes)
> - Single resource with partitions.
> - Full auto rebalancer
> - Zookeeper quorum (3 nodes)
> When one participant loses the zookeeper connection (it is not able to connect to any
> of the zookeepers; a typical occurrence we faced was a switch failure on that rack or a
> network switch failure on a node)
>   ---- >  The partition (P1) for which this participant (say Node N1) is online is
> still maintained
> Meanwhile, since it loses the ephemeral node in zookeeper, the rebalancer gets triggered
> and it reallocates the partition (P1) to another participant node (say Node N2) to become
> online @ time T1
>                 ---- >  After this both N1 and N2 are acting as online for the same
> partition (P1)
> But as soon as the participant (say Node N1) is able to re-establish the zookeeper connection
> @ time T2
>                 ---- >  Reset gets called on the partition in participant (say Node
> Double assignment: 
> The question here is: is this expected behavior, that both nodes N1 and N2 could be online
> for the same partition (P1) between times T1 and T2?

This message was sent by Atlassian JIRA
