incubator-s4-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "kishore gopalakrishna (JIRA)" <>
Subject [jira] [Commented] (S4-110) Enhance cluster management features in S4
Date Tue, 22 Jan 2013 20:14:13 GMT


kishore gopalakrishna commented on S4-110:

Good point Aimee, I agree that different PE can be partitioned differently and performance
is a good enough reason to support that. It is possible to solve this problem in Helix. 

If you look at the implementation(though its not clear from usage instructions), it is not
the stream that is actually partitioned it is actually the task that processes a stream that
is partitioned. A task basically says i need to process this stream and i want to be partitioned
into p ways. Currently the attribute associated with a task is stream name, partition . The
sender basically computes the reverse mapping stream name, partition to a task and send the
event to the node where the task is running.

To support the feature you described, its possible to add another attribute to the task like
a petype( by default this can be ALL). And each task can be partitioned differently. 

The sender will send an event to all tasks that are interested in a stream.(We can optionally
support sending event to a task that is processing a particular stream and petype).

On the App side we will have to support registering to a stream and optionally to a stream,petype.

It does involve some change in s4 but it wont change the overall design much. 

Does it make sense? 

> Enhance cluster management features in S4
> -----------------------------------------
>                 Key: S4-110
>                 URL:
>             Project: Apache S4
>          Issue Type: New Feature
>            Reporter: kishore gopalakrishna
>            Assignee: kishore gopalakrishna
>             Fix For: 0.6
> In S4 the number of partition is fixed for all streams and is dependent on the number
of nodes in the cluster.  Adding new nodes to S4 cluster causes the number of partitions to
change. This results in lot of data movement. For example if there are 4 nodes and you add
another node then nearly all keys will be remapped which result is huge data movement where
as ideally only 20% of the data should move.
> By using Helix, every stream can be  partitioned differently and independent of the number
of nodes. Helix distributes the partitions evenly among the nodes. When new nodes are added,
partitions can be migrated to new nodes without changing the number of partitions and  minimizes
the data movement.
> In S4 handles failures by having stand by nodes that are idle most of the time and become
active when a node fails. Even though this works, its not ideal in terms of efficient hardware
usage since the stand by nodes are idle most of the time. This also increases the fail over
time since the PE state has to be transfered to only one node. 
> Helix allows S4 to have Active and Standby nodes at a partition level so that all nodes
can be active but some partitions will be Active and some in stand by mode. When a node fails,
the partitions that were  Active on that node will be evenly distributed among the remaining
nodes. This provides automatic load balancing and also improves fail over time, since PE state
can be transfered to multiple nodes in parallel. 
> I have a prototype implementation here
> Instructions to build it and try it out are in the Readme.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message