helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Questions about custom helix rebalancer/controller/agent
Date Fri, 01 Aug 2014 17:33:37 GMT
Deleting a resource will trigger master->offline transitions. In addition
to that, it will trigger offline->dropped transitions. If you plan to clean
up resources, offline->dropped is the right place to do that.
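
A minimal sketch of that cleanup placement follows. It is a plain-Java stand-in that does not depend on Helix: the class name, the `localData` map, and the String arguments are assumptions for illustration. In a real participant the class would extend Helix's StateModel and each handler would take (Message, NotificationContext); only the onBecome<To>From<From> method naming is taken from Helix.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for a Helix state model (hypothetical, no Helix dependency).
final class PartitionStateModelSketch {
    // Hypothetical per-partition local data, e.g. a handle to files on disk.
    private final Map<String, String> localData = new HashMap<>();

    // OFFLINE -> MASTER: load the partition and start serving it.
    void onBecomeMasterFromOffline(String partition) {
        localData.put(partition, "loaded:" + partition);
    }

    // MASTER -> OFFLINE: stop serving, but keep the data; the partition
    // may be brought back online on this node later.
    void onBecomeOfflineFromMaster(String partition) {
        // intentionally no cleanup here
    }

    // OFFLINE -> DROPPED: the partition is gone for good (for example the
    // resource was deleted), so this is where local data is released.
    void onBecomeDroppedFromOffline(String partition) {
        localData.remove(partition);
    }

    boolean hasLocalData(String partition) {
        return localData.containsKey(partition);
    }

    public static void main(String[] args) {
        PartitionStateModelSketch model = new PartitionStateModelSketch();
        model.onBecomeMasterFromOffline("myResource_0");
        model.onBecomeOfflineFromMaster("myResource_0");
        System.out.println("after OFFLINE: " + model.hasLocalData("myResource_0"));
        model.onBecomeDroppedFromOffline("myResource_0");
        System.out.println("after DROPPED: " + model.hasLocalData("myResource_0"));
    }
}
```

Keeping the data through MASTER->OFFLINE and deleting it only on OFFLINE->DROPPED means an ordinary failover or rebalance does not force a reload.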




On Fri, Aug 1, 2014 at 10:30 AM, Varun Sharma <varun@pinterest.com> wrote:

> In my case, I will have many resources, say up to 100.
> Each of them will have between 100 and 5K partitions, so I guess I do
> require the bucket size. 300K partitions is the sum of partitions across
> all resources, rather than the # of partitions within a single resource.
>
> Another question I had was regarding removing a resource in Helix. When
> removeResource is called from HelixAdmin, would it trigger the
> MASTER->OFFLINE transitions for the respective partitions before the
> resource is removed?
> To concretize my use case, we have many resources with a few thousand
> partitions being loaded every day. New versions of the resources keep
> getting loaded as brand new resources into Helix and the older versions are
> decommissioned/garbage collected. So we would be issuing up to 100 or so
> resource additions per day and up to 100 or so resource deletions every
> day. Just want to check that deleting a resource would also trigger the
> appropriate MASTER->OFFLINE transitions.
>
> Thanks
> Varun
>
>
> On Fri, Aug 1, 2014 at 10:18 AM, Kanak Biscuitwala <kanak.b@hotmail.com>
> wrote:
>
>> a) By default, there is one znode per resource, which as you know is a
>> grouping of partitions. The biggest limitation is that ZK has a 1MB limit
>> on znode sizing. To get around this, Helix has the concept of bucketizing,
>> where in your ideal state, you can set a bucket size, which will
>> effectively create that many znodes to fully represent all your partitions.
>> I believe that you can have ~2k partitions before you start needing to
>> bucketize.
>>
>> 300k may cause you separate issues, and you may want to consider doing
>> things like enabling batch message mode in your ideal state so that each
>> message we send to an instance contains transitions for all partitions
>> hosted on that instance, rather than creating a znode per partition state
>> change. However, in theory (we've never played with this many in practice),
>> Helix should be able to function correctly with that many partitions.
>>
>> b) Yes, if you have a hard limit of 1 master per partition, Helix will
>> transition the first node to OFFLINE before sending the MASTER transition
>> to the new master.
>>
>> Kanak
>>
>> ------------------------------
>> Date: Fri, 1 Aug 2014 10:09:24 -0700
>>
>> Subject: Re: Questions about custom helix rebalancer/controller/agent
>> From: varun@pinterest.com
>> To: user@helix.apache.org
>>
>> Sounds fine to me. I can work without the FINALIZE notification for now,
>> but I hope it's going to come out soon. A few more questions:
>>
>> a) How well does Helix scale with partitions - is each partition a
>> separate znode inside helix ? If I have 300K partitions in Helix would that
>> be an issue ?
>> b) If a partition which was assigned as a master on node1 is now assigned
>> as a master on node2, will node1 get a callback execution for transition
>> from MASTER-->OFFLINE
>>
>> Thanks
>> Varun
>>
>>
>> On Thu, Jul 31, 2014 at 11:18 PM, Kanak Biscuitwala <kanak.b@hotmail.com>
>> wrote:
>>
>> s/run/start/g -- sorry about that, fixed in javadocs for future releases
>>
>> You may need to register for a notification type; I believe
>> HelixCustomCodeRunner complains if you don't. However, you can simply
>> ignore that notification type, and just check for INIT and FINALIZE
>> notification types in your callback to track whether or not you're the
>> leader. On INIT, you start your 30 minute timer, and on FINALIZE you stop
>> it. You may need to wait for us to make a 0.6.4 release (we will likely do
>> this soon) to get the FINALIZE notification.
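
The INIT/FINALIZE pattern described above can be sketched as follows. This is a plain-Java illustration under stated assumptions, not the HelixCustomCodeRunner API: the `NotificationType` enum is a local stand-in for the notification types named in the message, and `LeaderTimerSketch`, `onNotification`, and `isLeaderTimerRunning` are hypothetical names. A real callback would read the type from the NotificationContext it receives.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Starts a periodic task when leadership is gained and stops it when lost.
final class LeaderTimerSketch {
    // Local stand-in for the notification types mentioned in the thread.
    enum NotificationType { INIT, CALLBACK, FINALIZE }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable task;
    private final long periodMinutes;
    private ScheduledFuture<?> running; // non-null only while we are leader

    LeaderTimerSketch(Runnable task, long periodMinutes) {
        this.task = task;
        this.periodMinutes = periodMinutes;
    }

    synchronized void onNotification(NotificationType type) {
        switch (type) {
            case INIT: // became leader: start the periodic work
                if (running == null) {
                    running = scheduler.scheduleAtFixedRate(
                            task, 0, periodMinutes, TimeUnit.MINUTES);
                }
                break;
            case FINALIZE: // lost leadership: stop the periodic work
                if (running != null) {
                    running.cancel(false);
                    running = null;
                }
                break;
            default: // the notification type we registered for but ignore
                break;
        }
    }

    synchronized boolean isLeaderTimerRunning() {
        return running != null;
    }

    // Releases the scheduler thread so the JVM can exit.
    void shutdown() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        LeaderTimerSketch timer =
                new LeaderTimerSketch(() -> System.out.println("rebalance pass"), 30);
        timer.onNotification(NotificationType.INIT);
        System.out.println("running: " + timer.isLeaderTimerRunning());
        timer.onNotification(NotificationType.FINALIZE);
        System.out.println("running: " + timer.isLeaderTimerRunning());
        timer.shutdown();
    }
}
```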
>>
>> Here is an example of a custom code runner usage:
>> Registration:
>> https://github.com/kishoreg/fullmatix/blob/master/mysql-cluster/src/main/java/org/apache/fullmatix/mysql/MySQLAgent.java
>> Callback:
>> https://github.com/kishoreg/fullmatix/blob/master/mysql-cluster/src/main/java/org/apache/fullmatix/mysql/MasterSlaveRebalancer.java
>>
>> Regarding setting up the Helix controller, you actually don't need to
>> instantiate a GenericHelixController. If you create a HelixManager with
>> InstanceType.CONTROLLER, then ZKHelixManager automatically creates a
>> GenericHelixController and sets it up with leader election. We really
>> should update the documentation to clarify that.
>>
>> ------------------------------
>> Date: Thu, 31 Jul 2014 23:00:13 -0700
>>
>> Subject: Re: Questions about custom helix rebalancer/controller/agent
>> From: varun@pinterest.com
>> To: user@helix.apache.org
>>
>> Thanks for the suggestions.
>>
>> Seems like the HelixCustomCodeRunner could do it. However, it seems like
>> the CustomCodeRunner only provides hooks for plugging into notifications.
>> The documentation example in the above link suggests a run() method, which
>> does not seem to exist.
>>
>> However, this may be sufficient for my case. I essentially hook in an
>> empty CustomCodeRunner into my helix manager. Then I can instantiate my own
>> thread which would run above snippet and keep writing ideal states every 30
>> minutes. I guess I would still need to attach the GenericHelixController
>> with the following code snippet to take action whenever the ideal state
>> changes ??
>>
>> GenericHelixController controller = new GenericHelixController();
>> manager.addConfigChangeListener(controller);
>> manager.addLiveInstanceChangeListener(controller);
>> manager.addIdealStateChangeListener(controller);
>> manager.addExternalViewChangeListener(controller);
>> manager.addControllerListener(controller);
>>
>>
>>
>>
>>
>> On Thu, Jul 31, 2014 at 6:01 PM, kishore g <g.kishore@gmail.com> wrote:
>>
>> List resourceList = helixAdmin.getResourceList();
>> for each resource:
>>    Compute target ideal state
>>    helixAdmin.setIdealState(resource, targetIdealState);
>>
>> Thread.sleep(30minutes);
>>
>> This can work right. This code can be as part of CustomCodeRunner.
>> http://helix.apache.org/javadocs/0.6.3/reference/org/apache/helix/participant/HelixCustomCodeRunner.html.
>> You can say you are interested in notifications but can ignore that.
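
The loop above can be sketched in runnable form as below. To keep it self-contained it does not call Helix: `IdealStateAdmin` and `InMemoryAdmin` are hypothetical stand-ins, and an ideal state is reduced to a simple partition-to-instance map. In Helix 0.6.x I believe the corresponding HelixAdmin calls are getResourcesInCluster(clusterName) and setResourceIdealState(clusterName, resourceName, idealState), but check the javadocs for your release.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PeriodicRebalanceSketch {

    // Hypothetical stand-in for the two admin operations the loop needs.
    interface IdealStateAdmin {
        List<String> listResources();
        void writeIdealState(String resource, Map<String, String> idealState);
    }

    // In-memory implementation, only so the sketch runs without ZooKeeper.
    static final class InMemoryAdmin implements IdealStateAdmin {
        final List<String> resources;
        final Map<String, Map<String, String>> written = new HashMap<>();

        InMemoryAdmin(List<String> resources) {
            this.resources = new ArrayList<>(resources);
        }

        @Override
        public List<String> listResources() {
            return resources;
        }

        @Override
        public void writeIdealState(String resource, Map<String, String> idealState) {
            written.put(resource, idealState);
        }
    }

    // One pass: compute and write a target ideal state for every resource.
    // Returns the number of ideal states written.
    static int rebalanceOnce(IdealStateAdmin admin,
                             Function<String, Map<String, String>> computeTarget) {
        int count = 0;
        for (String resource : admin.listResources()) {
            admin.writeIdealState(resource, computeTarget.apply(resource));
            count++;
        }
        return count;
    }

    // Repeats the pass until the thread is interrupted.
    static void runEvery(IdealStateAdmin admin,
                         Function<String, Map<String, String>> computeTarget,
                         long intervalMillis) throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            rebalanceOnce(admin, computeTarget);
            Thread.sleep(intervalMillis);
        }
    }

    public static void main(String[] args) {
        InMemoryAdmin admin = new InMemoryAdmin(Arrays.asList("resA", "resB"));
        int written = rebalanceOnce(admin,
                r -> Collections.singletonMap(r + "_0", "node1"));
        System.out.println("ideal states written: " + written);
    }
}
```

Separating the single pass (`rebalanceOnce`) from the sleep loop makes it easy to drive the pass from a timer that only runs on the leader.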
>>
>> thanks,
>> Kishore G
>>
>>
>> On Thu, Jul 31, 2014 at 5:45 PM, Kanak Biscuitwala <kanak.b@hotmail.com>
>> wrote:
>>
>> i.e. helixAdmin.enableCluster(clusterName, false);
>>
>> ------------------------------
>> From: kanak.b@hotmail.com
>> To: user@helix.apache.org
>> Subject: RE: Questions about custom helix rebalancer/controller/agent
>> Date: Thu, 31 Jul 2014 17:44:40 -0700
>>
>>
>> Unfortunately HelixAdmin#rebalance is a misnomer, and it is a function of
>> all the configured instances and not the live instances. The closest you
>> can get to that is to use the third option I listed related to CUSTOMIZED
>> mode, where you write the mappings yourself based on what is live.
>>
>> Another thing you could do is pause the cluster controller and unpause it
>> for a period every 30 minutes. That will essentially enforce that the
>> controller will not send transitions (or do anything else, really) during
>> the time it is paused. This sounds a little like a hack to me, but it may
>> do what you want.
>>
>> Kanak
>>
>> ------------------------------
>> Date: Thu, 31 Jul 2014 17:39:40 -0700
>> Subject: Re: Questions about custom helix rebalancer/controller/agent
>> From: varun@pinterest.com
>> To: user@helix.apache.org
>>
>> Thanks Kanak, for your detailed response and this is really very helpful.
>> I was wondering if it's possible for me to do something like the following:
>>
>> List resourceList = helixAdmin.getResourceList();
>> for each resource:
>>    Compute target ideal state
>>    helixAdmin.rebalance(resource);
>>
>> Thread.sleep(30minutes);
>>
>> So, the above happens inside a while loop thread and this is the only
>> place where we do the rebalancing ?
>>
>> Thanks
>> Varun
>>
>>
>> On Thu, Jul 31, 2014 at 5:25 PM, Kanak Biscuitwala <kanak.b@hotmail.com>
>> wrote:
>>
>>  Hi Varun,
>>
>> Sorry for the delay.
>>
>> 1 and 3) There are a number of ways to do this, with various tradeoffs.
>>
>> - You can write a user-defined rebalancer. In helix 0.6.x, it involves
>> implementing the following interface:
>>
>>
>> https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/controller/rebalancer/Rebalancer.java
>>
>> Essentially what it does is given an existing ideal state, compute a new
>> ideal state. For 0.6.x, this will read the preference lists in the output
>> ideal state and compute a state mapping based on them. If you need more
>> control, you can also implement:
>>
>>
>> https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/controller/rebalancer/internal/MappingCalculator.java
>>
>> which will allow you to create a mapping from partition to map of
>> participant and state. In 0.7.x, we consolidated these into a single method.
>>
>> Here is a tutorial on the user-defined rebalancer:
>> http://helix.apache.org/0.6.3-docs/tutorial_user_def_rebalancer.html
>>
>> Now, running this every 30 minutes is tricky because by default the
>> controller responds to all cluster events (and really it needs to because
>> it aggregates all participant current states into the external view --
>> unless you don't care about that).
>>
>> - Combined with the user-defined rebalancer (or not), you can have a
>> GenericHelixController that doesn't listen on any events, but calls
>> startRebalancingTimer(), into which you can pass 30 minutes. The problem
>> with this is that the instructions at
>> http://helix.apache.org/0.6.3-docs/tutorial_controller.html won't work
>> as described because of a known issue. The workaround is to connect
>> HelixManager as role ADMINISTRATOR instead of CONTROLLER.
>>
>> However, if you connect as ADMINISTRATOR, you have to set up leader
>> election yourself (assuming you want a fault-tolerant controller). See
>> https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/manager/zk/DistributedLeaderElection.java
>> for a controller change listener that can do leader election, but your version
>> will have to be different, as you actually don't want to add listeners, but
>> rather set up a timer.
>>
>> This also gives you the benefit of plugging in your own logic into the
>> controller pipeline. See
>> https://github.com/apache/helix/blob/helix-0.6.x/helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
>> createDefaultRegistry() for how to create an appropriate PipelineRegistry.
>>
>> - You can take a completely different approach and put your ideal state
>> in CUSTOMIZED rebalance mode. Then you can have a meta-resource where one
>> participant is a leader and the others are followers (you can create an
>> ideal state in SEMI_AUTO mode, where the replica count and preference
>> list of resourceName_0 is "ANY_LIVEINSTANCE"). When one
>> participant is told to become leader, you can set a timer for 30 minutes
>> and update and write the map fields of the ideal state accordingly.
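
For the CUSTOMIZED-mode option, the mapping the leader writes is a map from partition name to a map of instance to state. The sketch below computes such a mapping from the set of live instances. It is an illustration only: the round-robin placement, the single MASTER replica per partition, and the names `CustomizedMappingSketch` and `computeMapFields` are assumptions, and the result would still have to be copied into the ideal state's map fields and written back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CustomizedMappingSketch {

    // Returns partition -> (instance -> state), assigning one MASTER per
    // partition round-robin over the sorted live instances. Partition names
    // follow the resourceName_N convention used in the thread. With no live
    // instances the result is empty.
    static Map<String, Map<String, String>> computeMapFields(
            String resource, int numPartitions, List<String> liveInstances) {
        Map<String, Map<String, String>> mapFields = new TreeMap<>();
        if (liveInstances.isEmpty()) {
            return mapFields;
        }
        // Sort so that every leader computes the same placement.
        List<String> sorted = new ArrayList<>(liveInstances);
        Collections.sort(sorted);
        for (int p = 0; p < numPartitions; p++) {
            Map<String, String> instanceToState = new TreeMap<>();
            instanceToState.put(sorted.get(p % sorted.size()), "MASTER");
            mapFields.put(resource + "_" + p, instanceToState);
        }
        return mapFields;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> mapping =
                computeMapFields("db", 3, Arrays.asList("node2", "node1"));
        System.out.println(mapping);
    }
}
```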
>>
>> 2) I'm not sure I understand the question. If you're in the JVM, you
>> simply need to connect as a PARTICIPANT for your callbacks, but that can
>> just be something you do at the beginning of your node startup. The rest of
>> your code is more or less governed by your transitions, but if there are
>> things you need to do on the side, there is nothing in Helix preventing you
>> from doing so. See
>> http://helix.apache.org/0.6.3-docs/tutorial_participant.html for
>> participant logic.
>>
>> 4) The current state is per-instance and is literally called
>> CurrentState. For a given participant, you can query a current state by
>> doing something like:
>>
>> HelixDataAccessor accessor = helixManager.getHelixDataAccessor();
>> CurrentState currentState =
>>     accessor.getProperty(accessor.keyBuilder().currentState(instanceName,
>>         sessionId, resourceName));
>>
>> If you implement a user-defined rebalancer as above, we automatically
>> aggregate all these current states into a CurrentStateOutput object.
>>
>> 5) You can use a Helix spectator:
>>
>> http://helix.apache.org/0.6.3-docs/tutorial_spectator.html
>>
>> This basically gives you a live-updating routing table for the mappings
>> of the Helix-managed resource. However, it requires the external view to be
>> up to date, going back to my other point of perhaps separating the concept
>> of changing mappings every 30 minutes from the frequency at which the
>> controller runs.
>>
>> Hopefully this helps.
>>
>> Kanak
>>
>> ------------------------------
>> Date: Thu, 31 Jul 2014 12:13:27 -0700
>> Subject: Questions about custom helix rebalancer/controller/agent
>> From: varun@pinterest.com
>> To: user@helix.apache.org
>>
>>
>> Hi,
>>
>> I am trying to write a customized rebalancing algorithm. I would like to
>> run the rebalancer every 30 minutes inside a single thread. I would also
>> like to completely disable Helix triggering the rebalancer.
>>
>> I have a few questions:
>> 1) What's the best way to run the custom controller ? Can I simply
>> instantiate a ZKHelixAdmin object and then keep running my rebalancer
>> inside a thread or do I need to do something more.
>>
>> Apart from rebalancing, I want to do other things inside the
>> controller, so it would be nice if I could simply fire up the controller
>> through code. I could not find this in the documentation.
>>
>> 2) Same question for the Helix agent. My Helix Agent is a JVM process
>> which does other things apart from exposing the callbacks for state
>> transitions. Is there a code sample for the same ?
>>
>> 3) How do I disable Helix triggered rebalancing once I am able to run the
>> custom controller ?
>>
>> 4) During my custom rebalance run, how can I get the current cluster
>> state - is it through ClusterDataCache.getIdealState() ?
>>
>> 5) For clients talking to the cluster, does helix provide an easy
>> abstraction to find the partition distribution for a helix resource ?
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>
