helix-user mailing list archives

From Zhen Zhang <zzh...@linkedin.com>
Subject Re: HELP
Date Fri, 16 Jan 2015 19:13:06 GMT
Hi Jiangjie,

There are a few things we need to make clear before using Helix to manage your cluster:

  1.  What the resource is and how it is partitioned. Based on your description, the resource
seems to be a set of machines (servers and clients).
  2.  Who hosts the resource. Helix is about resource assignment in distributed systems. For
example, if you have a database, it may be partitioned and hosted by a set of nodes. In your
case, it's not clear who hosts the resource.
  3.  What state model you are going to use.
  4.  Failure handling. In your description, if a server fails, a state transition will be
triggered on both servers and clients. It's not clear which server should receive the notification.
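
To make the four points above concrete, here is a minimal, Helix-free sketch in plain Java. It assumes each server group is a partition, a two-state OFFLINE/ONLINE model, and clients that subscribe to the partitions they depend on. All class and method names (FailoverSketch, assign, subscribe, markFailed) are hypothetical and are not part of the Helix API; in Helix itself the state model and its transition callbacks would play this role.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; it only illustrates the questions above
// and does not use or imitate the Helix API.
final class FailoverSketch {
    // Point 3: a deliberately simple two-state model.
    enum State { OFFLINE, ONLINE }

    // Callback a client registers for a partition it depends on.
    interface PartitionListener {
        void onServerStateChange(String partition, String server, State from, State to);
    }

    // Points 1 and 2: partition -> (hosting server -> current state).
    private final Map<String, Map<String, State>> assignment = new HashMap<>();
    // Clients interested in each partition.
    private final Map<String, List<PartitionListener>> listeners = new HashMap<>();

    // Record that a server hosts a partition and starts out ONLINE.
    void assign(String partition, String server) {
        assignment.computeIfAbsent(partition, p -> new HashMap<>()).put(server, State.ONLINE);
    }

    // Register a client callback for one partition.
    void subscribe(String partition, PartitionListener listener) {
        listeners.computeIfAbsent(partition, p -> new ArrayList<>()).add(listener);
    }

    // Point 4: move the failed server ONLINE -> OFFLINE in every partition it
    // hosts and notify only the clients of those partitions.
    // Returns the number of partitions affected.
    int markFailed(String server) {
        int affected = 0;
        for (Map.Entry<String, Map<String, State>> entry : assignment.entrySet()) {
            Map<String, State> replicas = entry.getValue();
            if (replicas.get(server) == State.ONLINE) {
                replicas.put(server, State.OFFLINE);
                affected++;
                List<PartitionListener> interested = listeners.getOrDefault(
                        entry.getKey(), Collections.<PartitionListener>emptyList());
                for (PartitionListener listener : interested) {
                    listener.onServerStateChange(entry.getKey(), server, State.ONLINE, State.OFFLINE);
                }
            }
        }
        return affected;
    }

    // Current state of a server within a partition, or null if it is not assigned.
    State stateOf(String partition, String server) {
        Map<String, State> replicas = assignment.get(partition);
        return replicas == null ? null : replicas.get(server);
    }
}
```

In this sketch a client would release the failed server's resources inside its onServerStateChange callback, while server-side actions such as logging, alarming, or restarting the process would hang off the same ONLINE to OFFLINE transition.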

Once we are clear on these, it should be fairly straightforward to use Helix. You may also
be interested in looking at a few simple examples under the recipes folder (https://github.com/apache/helix/tree/master/recipes).


From: jianjie feng <augustus.feng@gmail.com>
Reply-To: user@helix.apache.org
Date: Thursday, January 15, 2015 at 7:57 PM
To: user@helix.apache.org
Subject: HELP

  We are trying to use Helix to manage our clusters (300+ nodes) and we now have a problem,
please help!

  Let me describe it.

  Our cluster is made up of servers and clients; servers are partitioned into groups (partitions
in Helix) and clients are partitioned accordingly. We are now trying to add some fault tolerance,
like this:

  1) If one server node fails, trigger a state transition (server side) that does something like
print a log, trigger an alarm, and restart the server process;

  2) Then trigger a state transition on all client nodes belonging to this partition that does something
like kick out the failed server and release the failed server's resources on the client.

  Could someone please tell me how to implement this using Helix? Thanks!

  It would be even better if you could show me some code samples.

