helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: State transitions of partitions
Date Fri, 01 Mar 2013 07:40:22 GMT
Hi Vinayak,

This is how you can make the operation idempotent, irrespective of 1), 2),
or 3): always use the ClusterMessagingService first to check whether any node
already has the data, and pull it from there if one does. If you don't get a
response from other nodes in the system, fall back to 2) and use the data
you already have locally; if there is no local data either, assume you are
creating the resource for the first time. The search system at LinkedIn used
to do this, but they no longer have this requirement since they assume the
indexes are available when the node starts. I might be wrong here, but anyhow
you get the idea.
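A rough sketch of that decision order in plain Java. The booleans stand in for
the real checks (a ClusterMessagingService query to the peers and a local
artifact lookup), which are not shown here; `BootstrapDecider` and
`BootstrapAction` are hypothetical names for illustration:

```java
// Sketch of the idempotent bootstrap decision described above.
// peerHasData / localDataExists are stand-ins for the real
// ClusterMessagingService query and the local artifact check.
enum BootstrapAction { PULL_FROM_PEER, REUSE_LOCAL, CREATE_FRESH }

final class BootstrapDecider {
  static BootstrapAction decide(boolean peerHasData, boolean localDataExists) {
    if (peerHasData) {
      return BootstrapAction.PULL_FROM_PEER; // another replica can serve a copy
    }
    if (localDataExists) {
      return BootstrapAction.REUSE_LOCAL;    // restart case: reuse what we have
    }
    return BootstrapAction.CREATE_FRESH;     // first creation of the resource
  }
}
```

The point is only the ordering: asking peers first is what makes the operation
idempotent, since a node that crashed mid-bootstrap simply re-runs the same
checks on restart.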

The messaging infrastructure is generic, but it is not intended to be an RPC
mechanism between nodes. Messages travel via ZooKeeper, so it can't really be
used to achieve high throughput or low latency. The reason for this design is
that it allows messages to persist across node restarts, ensuring every node
eventually processes the message. In cases where you don't need such
persistence, it is possible to extend the messaging service to do RPC between
nodes.

Shirshanka is developing a Helix container module based on Netty that might
allow one to extend the messaging service to do RPC within the cluster. It
would also be useful for transferring files between nodes efficiently,
without copying the data into application memory, by using the sendfile API.
It's actually a good utility, and I can see it being used in multiple systems.
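For reference, the zero-copy path mentioned here is already exposed in
standard Java through `FileChannel.transferTo`, which the JVM implements with
sendfile(2) on platforms that support it. A minimal sketch (the
`ZeroCopyTransfer` class name is mine, not part of any Helix module):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ZeroCopyTransfer {
  // Copies src to dst using FileChannel.transferTo, so the bytes are
  // moved kernel-side where the OS supports sendfile and never enter
  // the application's heap. Returns the number of bytes transferred.
  static long transfer(Path src, Path dst) throws IOException {
    try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(dst,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE,
             StandardOpenOption.TRUNCATE_EXISTING)) {
      long position = 0;
      long size = in.size();
      // transferTo may move fewer bytes than requested, so loop.
      while (position < size) {
        position += in.transferTo(position, size - position, out);
      }
      return position;
    }
  }
}
```

A Netty-based container would presumably wrap something like this behind the
messaging service so file transfers between participants avoid the user-space
copy.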

Kishore G

On Thu, Feb 28, 2013 at 9:44 AM, Vinayak Borkar <vborky@yahoo.com> wrote:

>> Will this address your problem? We don't have distinct actions based on
>> ERROR codes that the controller would understand and act on differently.
>> Were you looking for something like that?
> I will need to think more about this. I think the retry mechanism might be
> good enough for now.
>> Good point on not differentiating whether the partition once existed vs.
>> was newly created. We actually plan to modify the drop notification
>> behavior. Jason/Terence are discussing this in another thread. Please
>> add your suggestions to that thread. We should probably have a create and
>> drop method (not a transition) on the participants.
> Currently, how do other systems that use Helix handle the bootstrapping
> process? When a resource is created for the first time, the actions of a
> participant are different as compared to other times when a resource
> partition is expanded to use another instance. Specifically, there are
> three cases that need to be handled with respect to bootstrapping:
> 1. A cluster is up and running, and a new resource is created and
> rebalanced.
> 2. A cluster that had resources is being started after being shut down.
> 3. A cluster is running and a resource is already laid out on the cluster.
> Then some partitions are moved to instances that previously did not have
> any partitions of that resource.
> I looked through the examples and found the ClusterMessagingService
> interface that can be used to send messages to instances in the cluster. I
> can see that case 3 can be handled by using the messaging infrastructure. However,
> both 1 and 2 will have the resource partitions start in the OFFLINE mode.
> The messaging API cannot help because all instances in the cluster are in
> the same boat for a particular resource in case 1 and case 2. So what is
> the preferred way to know if you are in case 1 or in case 2? One way I see
> is that if you have local artifacts matching the partitions that are
> transitioning from OFFLINE -> SLAVE, one could infer it is case 2. Is
> that how other systems solve this issue?
> On a separate note, is the messaging infrastructure general-purpose? That is,
> can it be used by applications to perform RPC in the cluster, obviating
> the need for a separate RPC mechanism like Avro? I can see that the handler
> will need more code than one would need to write when using Avro to get RPC
> working, but my question is about the design point of the messaging
> infrastructure.
> Thanks,
> Vinayak
