Mailing-List: contact commits-help@helix.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@helix.apache.org
Date: Wed, 12 Nov 2014 08:54:33 +0000 (UTC)
From: "Hudson (JIRA)" <jira@apache.org>
To: commits@helix.incubator.apache.org
Message-ID: <JIRA.12754126.1415622048000.478134.1415782473913@Atlassian.JIRA>
In-Reply-To: <JIRA.12754126.1415622048000@Atlassian.JIRA>
References: <JIRA.12754126.1415622048000@Atlassian.JIRA>
 <JIRA.12754126.1415622048228@arcas>
Subject: [jira] [Commented] (HELIX-543) Single partition unnecessarily moved
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HELIX-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207834#comment-14207834 ]

Hudson commented on HELIX-543:
------------------------------

UNSTABLE: Integrated in helix #1303 (See [https://builds.apache.org/job/helix/1303/])
HELIX-543 RB-27808 Avoid moving partitions unnecessarily when auto-rebalancing (g.kishore: rev dc9f129b67f8cacdf0cd22288f166b56fc5654a0)
* helix-core/src/main/java/org/apache/helix/controller/strategy/AutoRebalanceStrategy.java
* helix-agent/helix-agent-0.7.2-SNAPSHOT.ivy
* helix-core/src/test/java/org/apache/helix/integration/SinglePartitionLeaderStandByTest.java
* helix-core/src/test/java/org/apache/helix/controller/strategy/TestAutoRebalanceStrategy.java


> Single partition unnecessarily moved
> ------------------------------------
>
>                 Key: HELIX-543
>                 URL: https://issues.apache.org/jira/browse/HELIX-543
>             Project: Apache Helix
>          Issue Type: Bug
>          Components: helix-core
>    Affects Versions: 0.7.1, 0.6.4
>            Reporter: Tom Widmer
>            Assignee: kishore gopalakrishna
>            Priority: Minor
>
> (Copied from mailing list)
> I have some resources that I use with the OnlineOffline state model but which only have a single partition at the moment (essentially, Helix is just giving me a simple leader election to decide who controls the resource; I don’t care which participant has it, as long as only one does). However, with full auto rebalance, I find that the ‘first’ instance (alphabetically, I think) always gets the resource when it’s up. So if I take down the first node so that the partition transfers to the second node, then bring the first node back up, the resource transfers back unnecessarily.
> Note that this issue also affects multi-partition resources; it’s just a bit less noticeable. With 3 nodes and 4 partitions, say, the partitions are always allocated 2, 1, 1, so if node 1 is down (giving 0, 2, 2) and you then bring node 1 back up, it unnecessarily moves 2 partitions to make 2, 1, 1 rather than making the minimum move to achieve ‘balance’, which would be to move 1 partition from instance 2 or 3 back to instance 1.
> I can see the code in question in AutoRebalanceStrategy.typedComputePartitionAssignment, where the distRemainder is allocated to the first nodes alphabetically, so that the capacity of all nodes is not equal.
> The proposed solution is to sort the nodes by the number of partitions they already have assigned, which should mean that the nodes already holding the most partitions are assigned the higher capacity and the problem goes away.
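
The difference between the two orderings described in the report can be illustrated with a small sketch. This is not the code committed to AutoRebalanceStrategy; the class CapacitySketch and its methods capacities, byCurrentLoad and moves are hypothetical names invented for this example, and the node names are made up. It assumes only what the report states: each node gets an equal base capacity, and the remainder is handed out one partition at a time in node order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Helix implementation.
class CapacitySketch {

    // Give every node the base capacity, then hand the remainder
    // to the first nodes in the supplied order.
    static Map<String, Integer> capacities(List<String> orderedNodes, int numPartitions) {
        int base = numPartitions / orderedNodes.size();
        int remainder = numPartitions % orderedNodes.size();
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < orderedNodes.size(); i++) {
            result.put(orderedNodes.get(i), base + (i < remainder ? 1 : 0));
        }
        return result;
    }

    // Proposed ordering: nodes holding the most partitions come first,
    // with ties broken by name so the result is deterministic.
    static List<String> byCurrentLoad(Map<String, Integer> currentCounts) {
        List<String> nodes = new ArrayList<>(currentCounts.keySet());
        nodes.sort((a, b) -> {
            int byLoad = Integer.compare(currentCounts.get(b), currentCounts.get(a));
            return byLoad != 0 ? byLoad : a.compareTo(b);
        });
        return nodes;
    }

    // Number of partitions that must leave nodes holding more than their capacity.
    static int moves(Map<String, Integer> current, Map<String, Integer> capacity) {
        int moved = 0;
        for (Map.Entry<String, Integer> e : current.entrySet()) {
            moved += Math.max(0, e.getValue() - capacity.get(e.getKey()));
        }
        return moved;
    }

    public static void main(String[] args) {
        // Node 1 has just come back up: current distribution is 0, 2, 2.
        Map<String, Integer> current = new LinkedHashMap<>();
        current.put("node1", 0);
        current.put("node2", 2);
        current.put("node3", 2);

        List<String> alphabetical = Arrays.asList("node1", "node2", "node3");
        System.out.println(moves(current, capacities(alphabetical, 4)));           // prints 2
        System.out.println(moves(current, capacities(byCurrentLoad(current), 4))); // prints 1
    }
}
```

With the alphabetical order the capacities are 2, 1, 1, so both node2 and node3 give up a partition. With the load-based order the extra capacity stays on node2, and only one partition moves to node1. In the single-partition case the same ordering leaves the partition where it is.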


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)