helix-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Newbie Questions
Date Wed, 12 Feb 2014 09:44:25 GMT
Hi Sandeep,

1) Difference between Apache Helix and Norbert.
While both projects originated at LinkedIn, there are some fundamental
differences between the two. I will try to explain the difference via a
simple use case: a partitioned search index. In general you have 3 scenarios:

   - Start up --> Partition assignment: One needs to distribute the
   partitions among the nodes in the cluster.
      - In Norbert this process is manual. One needs to generate the
      mapping of partition to node and push this configuration to each
      server in the system (this is generally done via a config file). At
      start up, Norbert will simply write this configuration to ZooKeeper
      so that the clients can discover the partition to node mapping.
      - In Helix, the nodes simply join the cluster. Helix will inform the
      nodes which partitions to host based on the objectives and constraints (A
      simple objective would be to distribute the partitions evenly among the
      nodes).
   - Failure: When a node fails, you have multiple options: #1 Do
   nothing. #2 Re-assign the partitions hosted on the failed node to the
   remaining nodes. #3 Start a new node and assign the partitions to that new
   node.
      - Norbert simply informs you that a node left the group. You need to
      program the handling you require yourself.
      - Helix is capable of doing #1, #2 or even #3. The #3 feature is a work
      in progress and is possible if the deployment system is flexible and
      allows starting processes dynamically. If you are deploying on EC2, this
      should be possible.
   - Scaling: If you add more nodes to handle the workload, you would want to
   redistribute some of the work to the new nodes.
      - Norbert: This is again manual. You would have to change the
      configuration and restart all the nodes.
      - Helix: Helix would detect new nodes and fire appropriate
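
To make the three scenarios above concrete, here is a minimal sketch of an
"evenly distribute partitions" objective. It is not Helix's API or its actual
rebalancing algorithm; the function name assign_evenly and the node and
partition names are hypothetical. It only illustrates recomputing the mapping
from the current set of live nodes at start up, after a failure, and after
scaling out.

```python
from collections import defaultdict


def assign_evenly(partitions, nodes):
    """Round-robin the partitions over the sorted set of live nodes.

    Hypothetical helper for illustration only; not part of Helix.
    """
    if not nodes:
        raise ValueError("at least one live node is required")
    ordered = sorted(nodes)
    mapping = defaultdict(list)
    for i, partition in enumerate(partitions):
        mapping[ordered[i % len(ordered)]].append(partition)
    return dict(mapping)


partitions = [f"index_{i}" for i in range(6)]

# Start up: three nodes join the cluster.
startup = assign_evenly(partitions, {"node1", "node2", "node3"})

# Failure (option #2): node2 is gone, so its partitions move to the survivors.
after_failure = assign_evenly(partitions, {"node1", "node3"})

# Scaling: node4 joins, so some partitions move to it.
after_scaling = assign_evenly(partitions, {"node1", "node2", "node3", "node4"})

for label, mapping in [("startup", startup),
                       ("after_failure", after_failure),
                       ("after_scaling", after_scaling)]:
    print(label, mapping)
```

Recomputing the whole mapping like this is the simplest possible approach and
can move more partitions than strictly necessary; a real system would usually
try to minimize movement.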

As you already mentioned, Helix treats partitions, replicas, states, and
transitions as first-class citizens. What this means is you can not only
say how many partitions you have but also mention the number of replicas
for each partition. For example, for redundancy you can say you need 3
replicas for each partition and Helix will ensure that 3 replicas exist for
each partition.
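
As a rough illustration of the replica requirement, the sketch below places a
fixed number of replicas of each partition on distinct nodes. This is an
assumption-laden toy, not how Helix computes placements; place_replicas and
the node and partition names are hypothetical.

```python
def place_replicas(partitions, nodes, replicas):
    """Place `replicas` copies of each partition on distinct nodes.

    Hypothetical helper for illustration only; not part of Helix.
    """
    ordered = sorted(nodes)
    if replicas > len(ordered):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, partition in enumerate(partitions):
        # Consecutive nodes starting at offset i; distinct because
        # replicas <= number of nodes.
        placement[partition] = [
            ordered[(i + r) % len(ordered)] for r in range(replicas)
        ]
    return placement


placement = place_replicas(
    ["p0", "p1", "p2", "p3"],
    ["node1", "node2", "node3", "node4"],
    replicas=3,
)
for partition, hosts in placement.items():
    print(partition, hosts)
```

If a node fails, re-running the placement over the remaining nodes restores
the requested replica count, provided enough nodes are still alive.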

2. No, we haven't used Helix on Amazon, and I am not sure if anyone has done
this. There was another thread about this, and the poster didn't seem to think
it would be a big problem. It will help if you give us more information about your
application and set up.

3. "Incubating" indicates the state of the project in Apache; it does not
reflect the quality of the code or production readiness. Apart from
LinkedIn, others such as Box, Instagram, and JBoss (jBPM clustering) have used
Helix in production.

Hope this helps.


Kishore G

On Wed, Feb 12, 2014 at 12:55 AM, Sandeep Nayak <sandeep@chegg.com> wrote:

> Hi,
> I am a newbie to Apache Helix and am evaluating technologies to build
> clustered services. I have read through the documentation on the
> http://helix.incubator.apache.org/ site but did not find answers to the
> questions below so decided to ask them here.
> (1) What is the difference between Apache Helix and LinkedIn Norbert? I
> believe Norbert does not support state-machine transitions like Helix but
> is there a document/summary on what are the differences so someone like me
> can use that in my evaluation?
> (2) Has Apache Helix been used on Amazon? Is there any documentation on
> how to get this working?
> (3) The website has 0.6.2-incubating-stable and 0.7.0-incubating-alpha.
> Does incubating indicate the state of the project in Apache or is it
> indicative of the production-readiness of the library? I imagine the
> former because prior to getting to Apache the library was used at
> LinkedIn, am I correct?
> Thanks in advance,
> Sandeep
