tajo-dev mailing list archives

From Jihoon Son <jihoon...@apache.org>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Sun, 13 Apr 2014 05:36:33 GMT
Hi Alvin.
Thanks for your suggestion.

Overall, your suggestion looks very reasonable to me!
I'll check the POC.

Many thanks,
Jihoon
Hi All,
After doing a lot of research, in my opinion we should utilize
ZooKeeper for Tajo Master HA. I have created a small POC and shared it on my
GitHub repository (git@github.com:alvinhenrick/zooKeeper-poc.git).

To make things a little easier and more maintainable, I am
using Apache Curator, the fluent ZooKeeper client API developed at
Netflix, which is now an Apache open source project.

I have attached a diagram to convey the idea to the team
members. I will upload it to JIRA once everyone agrees with the proposed
solution.

Here is what the flow is going to look like.

            TajoMasterZkController   ==>


   1. This component will start, connect to the ZooKeeper quorum, and fight
      ( :) ) to obtain the latch/lock to become the master.
   2. Once the lock is obtained, the Apache Curator API will invoke the
      takeLeadership() method, which will then start the TajoMaster.
   3. As long as the TajoMaster is running, the controller will keep the
      lock and update the metadata on the ZooKeeper server with the
      HOSTNAME and RPC PORT.
   4. The other participants will keep waiting for the latch/lock to be
      released by ZooKeeper so that they can obtain the leadership.
   5. The advantage is that we can have as many TajoMasters as we want, but
      only one can be the leader, and it will consume resources only after
      obtaining the latch/lock.
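
The master-side steps above can be sketched roughly as follows. This is only an in-memory model of the flow, not the POC code: it does not talk to ZooKeeper or Curator. FakeZk, MasterController, simulateFailover and the host:port strings are hypothetical names made up for illustration; a fair Semaphore stands in for the latch/lock, and an AtomicReference stands in for the metadata kept on the ZooKeeper server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicReference;

public class MasterElectionSketch {

    /** Hypothetical in-memory stand-in for ZooKeeper: one lock plus the master metadata. */
    public static final class FakeZk {
        final Semaphore leaderLock = new Semaphore(1, true);
        final AtomicReference<String> masterAddress = new AtomicReference<>();
    }

    /** Hypothetical controller: waits for the lock, then acts as master until stopped. */
    public static final class MasterController implements Runnable {
        private final FakeZk zk;
        private final String hostAndPort;
        private final CountDownLatch leading = new CountDownLatch(1);
        private final CountDownLatch stopped = new CountDownLatch(1);

        public MasterController(FakeZk zk, String hostAndPort) {
            this.zk = zk;
            this.hostAndPort = hostAndPort;
        }

        @Override
        public void run() {
            try {
                zk.leaderLock.acquire();          // steps 1 and 4: block until we hold the lock
                try {
                    takeLeadership();             // step 2: callback once the lock is ours
                } finally {
                    zk.leaderLock.release();      // giving up the lock lets a standby take over
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private void takeLeadership() throws InterruptedException {
            zk.masterAddress.set(hostAndPort);    // step 3: publish HOSTNAME and RPC PORT
            leading.countDown();
            stopped.await();                      // keep the lock while the "master" runs
        }

        public void awaitLeadership() throws InterruptedException { leading.await(); }

        public void stop() { stopped.countDown(); }
    }

    /** Runs two candidates and returns the master address seen before and after failover. */
    public static List<String> simulateFailover() {
        FakeZk zk = new FakeZk();
        MasterController a = new MasterController(zk, "master-a:26001"); // hypothetical address
        MasterController b = new MasterController(zk, "master-b:26001"); // hypothetical address
        List<String> observed = new ArrayList<>();
        try {
            Thread ta = new Thread(a);
            ta.start();
            a.awaitLeadership();
            Thread tb = new Thread(b);
            tb.start();                           // b waits: a still holds the lock
            observed.add(zk.masterAddress.get());

            a.stop();                             // simulate the active master going away
            b.awaitLeadership();
            observed.add(zk.masterAddress.get());
            b.stop();
            ta.join();
            tb.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(simulateFailover());   // prints [master-a:26001, master-b:26001]
    }
}
```

The standby does no master work while it waits (step 5): it only blocks on the lock, and it publishes its own address after the first controller releases it.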


           TajoWorkerZkController ==>

   1. This component will start, connect to ZooKeeper (creating an
      EPHEMERAL ZNODE), and wait for events from ZooKeeper.
   2. The first listener will listen for successful registration.
   3. The second listener, on the master node, will listen for any changes
      to the master node received from the ZooKeeper server.
   4. If a failover occurs, the data on the master ZNODE will change, so
      the new HOSTNAME and RPC PORT can be obtained and the TajoWorker can
      establish a new RPC connection with the TajoMaster.
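
Steps 3 and 4 on the worker side could look roughly like the sketch below. Again this is an in-memory model under stated assumptions, not the POC code: MasterNode is a hypothetical stand-in for the master ZNODE and its watch mechanism, Worker is a hypothetical stand-in for the TajoWorker side, and "connecting" is reduced to recording the new host:port. The ephemeral registration from steps 1 and 2 is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class WorkerFailoverSketch {

    /** Hypothetical stand-in for the master ZNODE: holds "host:port" and notifies watchers. */
    public static final class MasterNode {
        private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();
        private volatile String data;

        public void watch(Consumer<String> watcher) {
            watchers.add(watcher);
            String current = data;
            if (current != null) {
                watcher.accept(current);          // deliver the current master right away
            }
        }

        public void setData(String hostAndPort) {
            data = hostAndPort;
            for (Consumer<String> watcher : watchers) {
                watcher.accept(hostAndPort);      // step 3: listeners see the change
            }
        }
    }

    /** Hypothetical worker: reconnects whenever the master ZNODE data changes. */
    public static final class Worker {
        private final List<String> connections = new ArrayList<>();
        private String currentMaster;

        public Worker(MasterNode node) {
            node.watch(this::onMasterChanged);
        }

        private synchronized void onMasterChanged(String hostAndPort) {
            if (hostAndPort.equals(currentMaster)) {
                return;                           // same master, keep the existing connection
            }
            currentMaster = hostAndPort;
            connections.add(hostAndPort);         // step 4: a real worker would open an RPC connection here
        }

        public synchronized String currentMaster() { return currentMaster; }

        public synchronized List<String> connectionHistory() {
            return new ArrayList<>(connections);
        }
    }

    public static void main(String[] args) {
        MasterNode node = new MasterNode();
        node.setData("master-a:26001");           // hypothetical address
        Worker worker = new Worker(node);
        node.setData("master-b:26001");           // failover: a new master publishes itself
        System.out.println(worker.connectionHistory());
    }
}
```

A TajoClientZkController could follow the same pattern, since the client also only needs to learn the current HOSTNAME and RPC PORT.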

To demonstrate, I have created a small Readme.txt file
on GitHub that explains how to run the example. Please read the log statements on the
console.

Similar to TajoWorkerZkController, we can also
implement TajoClientZkController.

Any help or advice is appreciated.

Thanks!
Warm Regards,
Alvin.
