helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Question about participant dropping a subset of assigned partitions
Date Tue, 12 Feb 2013 07:56:32 GMT
Hi Abhishek,

Regarding the standalone agent, Santiago has started this thread:
http://helix-dev.markmail.org/message/5h2fogbigexnhb4s. I think it supports
recursion, where one agent can manage another cluster. This is still in the
design phase, and there are multiple use cases that need a similar solution.
Feel free to contribute to the design/implementation on this JIRA:
https://issues.apache.org/jira/browse/HELIX-45.

If you are setting only the idealstate, I suggest you use the
CustomCodeInvoker. You can register for various changes (like nodes
starting/stopping, etc.). Take a look at this test:
https://git-wip-us.apache.org/repos/asf?p=incubator-helix.git;a=blob;f=helix-core/src/test/java/org/apache/helix/integration/TestHelixCustomCodeRunner.java;h=9bf79b8b34c14b7ce1e3fc45a45ceb19fdac4874;hb=437eb42e
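To make the idea concrete, here is a rough, untested sketch (plain Java, no Helix types; the class and method names are made up) of the kind of idealstate recomputation such custom code would run when it is notified that nodes started or stopped. The invoker would then write the resulting mapping back into the idealstate.

```java
import java.util.*;

// Untested sketch, not the Helix API: custom code reacting to a node-change
// callback usually just recomputes the partition-to-node mapping from the
// current live-instance list.
public class IdealStateSketch {

    // Round-robin the partitions over whatever instances are currently live.
    public static Map<String, String> recompute(List<String> partitions,
                                                List<String> liveInstances) {
        Map<String, String> mapping = new LinkedHashMap<>();
        if (liveInstances.isEmpty()) {
            return mapping; // nothing to assign to
        }
        for (int i = 0; i < partitions.size(); i++) {
            mapping.put(partitions.get(i),
                        liveInstances.get(i % liveInstances.size()));
        }
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(recompute(
                Arrays.asList("p0", "p1", "p2", "p3"),
                Arrays.asList("node1", "node2")));
    }
}
```

The real callback would read live instances from the NotificationContext's manager rather than take them as a parameter.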

This has the advantage that you can still benefit from features like
constraint enforcement and throttling that come with the default controller.
And as I mentioned earlier, this will allow you to upgrade to a newer Helix
version without any issues.

thanks,
Kishore G




On Thu, Feb 7, 2013 at 10:02 AM, Abhishek Rai <abhishekrai@gmail.com> wrote:

> Thanks again for the thoughtful feedback Kishore.
>
> On Sat, Feb 2, 2013 at 12:20 AM, kishore g <g.kishore@gmail.com> wrote:
>
>> Thanks Abhishek, you are thinking in the right direction.
>>
>> Your point about disable-enable happening too quickly is valid. However,
>> how fast can you detect the C++ process crash and restart it? Is the C++
>> process running in daemon mode and restarted automatically, or will the
>> helix-java-proxy agent be responsible for starting the C++ process?
>>
>
> Both options are possible at this time, but I was evaluating the
> disable-partition suggestion for soundness.  However, I agree that it's a
> highly unlikely scenario.
>
>
>> The messaging solution will work but the alternative of modeling each c++
>> process as a participant makes sense and is the right thing to do.
>>
>> Can you provide more details on "it's ok if the java-proxy dies": why does
>> it not affect correctness? When the agent is restarted, does it
>> re-register the C++ instances? Do you plan to store the C++ pid in Helix so
>> that the agent can remember the processes it had started?
>>
>
> Death or restarts of the Java proxy may result in temporary
> unavailability of some data until the controller rebalances the lost
> partitions.  Death or restarts of the C++ DDS process also affect
> availability, but if the Java proxy stays up, then (1) the controller may not
> notice the unavailability, and (2) DDS clients may continue to think that
> their data is reachable when it's not.  Sorry, I misused the term
> "correctness" for describing the weaker availability in the latter case.
>
>
>>
>> The reason I am asking these questions is that there is a similar effort on
>> writing a standalone Helix agent that acts as a proxy for other processes
>> started on the node. In general this approach seems to be quite useful and
>> might have some common functionality that can be leveraged across multiple
>> implementations.
>>
>
> That's great!  The proxy agent will be very useful.  I wonder if a good
> goal would be to enable recursion in the proxy such that the participant
> itself can be a controller for another Helix cluster.  Thus the
> proxy-participant could delegate its set of assigned resources to another
> set of participants.  This may be trivially true.
>
>
>> As for writing the custom rebalancer, you have two options: 1) as you
>> mentioned, you can write it inside the controller; 2) there is another
>> feature called CustomCodeInvoker. You can basically write your logic to
>> change the idealstate in this and simply run it along with your java proxy,
>> and Helix will ensure it is actively running on only one node. This has an
>> overhead of around 50-100ms on reacting to failure but is much cleaner. If
>> you are doing 1), you need to be careful to not change existing code but
>> simply add a new stage in the pipeline. That way you will be able to upgrade
>> Helix to get new features without breaking your functionality.
>>
>
> Thanks for the suggestions.  I'm doing 1) but in a slightly different way,
> please let me know if I'm totally off in the wrong direction :-)  Or if I'm
> likely to run into problems with future Helix upgrades.
>
> I've subclassed GenericHelixController and implemented all listener
> callbacks.  This subclass registers for all events and ensures that
> GenericHelixController listeners run for each event.  Internally, it
> implements the scheduling logic that it needs and applies it via
> ZKHelixAdmin.setResourceIdealState().  Do you see any clear benefits of
> changing this to insert a new stage in GenericHelixController's pipeline?
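For reference, the payload that a CUSTOM-mode idealstate carries is a map of maps (partition -> instance -> state), which is what a call like setResourceIdealState() ultimately applies. A stand-in sketch of just that structure, in plain Java with no Helix types and illustrative names:

```java
import java.util.*;

// Illustrative stand-in for the CUSTOM-mode idealstate payload: each
// partition maps to an (instance -> state) map. A real controller would
// build this inside an IdealState record and push it via the admin API;
// only the structure is modeled here.
public class CustomModeMapping {
    private final Map<String, Map<String, String>> partitionToStates =
            new LinkedHashMap<>();

    // Record that `instance` should hold `partition` in `state` (e.g. MASTER).
    public void set(String partition, String instance, String state) {
        partitionToStates
                .computeIfAbsent(partition, k -> new LinkedHashMap<>())
                .put(instance, state);
    }

    // Drop every replica of a partition, e.g. when its process crashed.
    public void dropPartition(String partition) {
        partitionToStates.remove(partition);
    }

    public Map<String, String> statesOf(String partition) {
        return partitionToStates.getOrDefault(partition, Collections.emptyMap());
    }
}
```

Removing a partition's map entries is what causes the controller to send DROPPED-style transitions to the affected node.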
>
> I'm following a similar scheme in the custom participant except that it
> directly registers the listeners with Helix without using a
> GenericHelixController.  I will take a closer look at CustomCodeInvoker,
> looks very useful.
>
>
>>
>> On the topic of "Helix does not have a C++ library", do you think it
>> would make things easier if there were a C++ library? This may not be that
>> hard to write because the only thing that needs to be written in C++ is the
>> participant code, which simply acts on messages from the controller. The
>> majority of the code is in the controller, and it can still run as Java. We
>> are working on a Python agent and I hope someone will write a C++ agent.
>>
>
> Thanks for the update, yeah I am thinking of taking a stab at it in 1-2
> months if it still seems useful.
>
>
>>
>> One of the good things about modeling each C++ process as an instance is
>> that, in the future, if there is a C++ Helix agent, you can easily migrate
>> to it.
>>
>
> Cool.
> Thanks again!
> Abhishek
>
>
>>
>> Hope this helps.
>>
>> thanks,
>> Kishore G
>>
>>
>>
>> On Fri, Feb 1, 2013 at 9:44 PM, Terence Yim <chtyim@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> What do you mean by the "fake" live instance that you mentioned? I think the
>>> Java proxy could simply create one HelixManager participant per C++
>>> instance (hence there are N HelixManager instances in the Java proxy) and
>>> disconnect them accordingly based on the liveness of the C++ process.
>>>
>>> Terence
>>>
>>> On Fri, Feb 1, 2013 at 7:19 PM, Abhishek Rai <abhishekrai@gmail.com> wrote:
>>>
>>>> Thanks for the quick and thoughtful response Kishore!  Comments inline.
>>>>
>>>>
>>>> On Fri, Feb 1, 2013 at 6:33 PM, kishore g <g.kishore@gmail.com> wrote:
>>>>
>>>>> Hi Abhishek,
>>>>>
>>>>> Thanks for the good question. We have two options (listed later in the
>>>>> email) for allowing a participant to drop partitions. However, it works
>>>>> in only two (auto, custom) of the three modes (auto_rebalance, auto,
>>>>> custom) the controller can support.
>>>>> More info about the modes is here:
>>>>> http://helix.incubator.apache.org/Features.html
>>>>>
>>>>> Can you let me know which mode you are running it in?
>>>>>
>>>>
>>>> We are planning to use CUSTOM mode since we have some specific
>>>> requirements about (1) the desired state for each partition, and (2)
>>>> scheduling partitions to instances.  Our requirements for (1) are not
>>>> expressible in the FSM framework.
>>>>
>>>>> Also, is it sufficient if the disabled partitions are re-assigned
>>>>> uniformly to other nodes, or do you want partitions from other nodes
>>>>> to be assigned to this node?
>>>>>
>>>>
>>>> Once a participant disables some partitions, it's alright for the
>>>> default rebalancing logic to kick in.
>>>>
>>>>
>>>>>
>>>>> Also it will help us if you can tell the use case when you need this
>>>>> feature.
>>>>>
>>>>
>>>> Sure, I'm still trying to hash things out but here is a summary.  The
>>>> DDS nodes are C++ processes, which is the crux of the problem.  AFAIK Helix
>>>> does not have a C++ library, so I'm planning to use a participant written
>>>> in Java, which runs as a separate process on the same node, receives state
>>>> transitions from the controller, and proxies them to the C++ process via an
>>>> RPC interface.  The problem is that the C++ process and the Java-Helix
>>>> proxy can fail independently.  I'm not worried about the Java-Helix proxy
>>>> crashing since that would knock off all partitions in the C++ process from
>>>> Helix's view, which does not affect correctness.
>>>>
>>>> But when the C++ process crashes, the Java-Helix proxy needs to let the
>>>> controller know asap, so the Helix "external view" can be updated,
>>>> rebalancing can start, etc.  One alternative is to invoke
>>>> "manager.disconnect()" from the Helix proxy.  But this would knock off all
>>>> partitions managed by the proxy (I want to retain the ability for the proxy
>>>> to manage multiple C++ programs).  Hence the question about selectively
>>>> dropping certain partitions, viz., the ones in a crashed C++ program.
>>>>
>>>>
>>>>> To summarize, you can achieve this in AUTO and CUSTOM but not in
>>>>> AUTO_REBALANCE mode because the goal of the controller is always to
>>>>> assign the partitions evenly among nodes. But you bring up a good use
>>>>> case; depending on the behavior, we might be able to support it easily.
>>>>>
>>>>> 1. Disable a partition on a given node: Disabling a partition on a
>>>>> particular node should automatically trigger rebalancing. This can be
>>>>> done either by an admin using the command line tool:
>>>>>
>>>>> helix-admin.sh --zkSvr <ZookeeperServerAddress(Required)>
>>>>>   --enablePartition <clusterName instanceName resourceName partitionName
>>>>>   true/false>
>>>>>
>>>>> or programmatically; if you have access to the manager, you can invoke:
>>>>>
>>>>> manager.getClusterManagementTool().enablePartition(enabled,
>>>>>     clusterName, instanceName, resourceName, partitionNames);
>>>>>
>>>>> This can be done in auto and custom.
>>>>>
>>>> I am not sure this will have the right effect in the scenario described
>>>> above.  Specifically, the Java proxy would need to disable all the crashed
>>>> partitions, and then re-enable them when the C++ DDS process reboots
>>>> successfully.  If the disable-enable transitions happen too quickly, could
>>>> the controller possibly miss the transition for some partition and not do
>>>> anything?
>>>>
>>>>> 2. The other option is to change the mapping of partition --> node in
>>>>> the ideal state. (You can do this programmatically in custom mode, and
>>>>> somewhat in auto mode as well.) Doing this will send transitions to the
>>>>> node to drop the partitions and reassign them to other nodes.
>>>>>
>>>>
>>>> Yes, this seems like the most logical thing.  The Java proxy will
>>>> probably need to send a message to the controller to trigger this change
>>>> in the ideal states of all crashed partitions.  The messaging API would
>>>> probably be useful here.
>>>>
>>>> Another alternative I'm considering is for the Java proxy to add a
>>>> "fake" instance for each C++ process that it spawns locally.  The custom
>>>> rebalancer (that I'd write inside the controller) would then schedule the
>>>> C++ DDS partitions on to these "fake" live instances.  When the C++ process
>>>> crashes, the Java proxy would simply disconnect the corresponding fake
>>>> instance's manager.  Does this make sense to you?  Or do you have any other
>>>> thoughts?
>>>>
>>>> Thanks again for your thoughtful feedback!
>>>> Abhishek
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Kishore G
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 1, 2013 at 5:44 PM, Abhishek Rai <abhishekrai@gmail.com> wrote:
>>>>>
>>>>>> Hi Helix users!
>>>>>>
>>>>>> I'm a Helix newbie and need some advice about a use case.  I'm using
>>>>>> Helix to manage a storage system which fits the description of a DDS
>>>>>> ("distributed data service" as defined in the Helix SOCC paper).  Each
>>>>>> participant hosts a bunch of partitions of a resource, as assigned by the
>>>>>> controller.  The set of partitions assigned to a participant changes
>>>>>> dynamically as the controller rebalances partitions, nodes join or leave,
>>>>>> etc.
>>>>>>
>>>>>> Additionally, I need the ability for a participant to "drop" a subset
>>>>>> of partitions currently assigned to it.  When a partition is dropped by a
>>>>>> participant, Helix would remove the partition from the current state of
>>>>>> the instance, update the external view, and make the partition available
>>>>>> for rebalancing by the controller.  Does the Java API provide a way of
>>>>>> accomplishing this?  If not, are there any workarounds?  Or, was there a
>>>>>> design rationale to disallow such actions from the participant?
>>>>>>
>>>>>> Thanks,
>>>>>> Abhishek
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
