helix-user mailing list archives

From Vu Nguyen <vusi...@gmail.com>
Subject helix rebalancing for multiple resources
Date Thu, 02 Jan 2014 04:31:56 GMT
We're looking into building something like a distributed task processing
cluster.  We already have code that runs the processing task on a single
host, which places stricter constraints on what we can do:
- partitioned task A: each partition needs to be assigned to a single
node, and a node may run only a single partition of that task
- another set of non-partitioned tasks (e.g. B, C, D) also needs to be
assigned to nodes, and it would be most efficient if those tasks were
assigned to separate nodes, so that any single node has at most 1 task
(either a partition of A, or B, C, D, etc.)
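
To make the constraint concrete, here is a minimal sketch in plain Python.
It is not Helix code and does not use any Helix API; the function name, the
node names, and the partition count of 3 for task A are all hypothetical.
It only illustrates the kind of global, one-task-per-node assignment we
have in mind:

```python
from typing import Dict, List


def assign_one_task_per_node(nodes: List[str], tasks: List[str]) -> Dict[str, str]:
    """Map every task to a distinct node (hypothetical helper, not a Helix API)."""
    if len(tasks) > len(nodes):
        raise ValueError("not enough nodes to give each task its own node")
    # Pairing tasks with nodes positionally guarantees no node is used twice.
    return dict(zip(tasks, sorted(nodes)))


# Assumed layout: task A has 3 partitions; B, C, D are non-partitioned.
tasks = [f"A_{i}" for i in range(3)] + ["B", "C", "D"]
nodes = [f"node{i}" for i in range(1, 8)]
assignment = assign_one_task_per_node(nodes, tasks)
```

The point is that the partitions of A and the tasks B, C, D are all placed
in one pass over one shared pool of nodes, rather than one resource at a
time.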

This seems to require a global view of all tasks.  However, from the examples
and the Rebalancer code, it appears that the resource mappings/assignments
are computed independently of one another.  Is that correct?  If so, is
Apache Helix the right framework for us, given the requirements above?

I saw that it might be possible to look up the current assignment of other
resources from within the rebalancing calculation methods, but I was then
concerned about concurrency issues, for example if the rebalances for
task A and task B were computed at the same time.

Thanks for any and all feedback.

Vu Nguyen
