tajo-dev mailing list archives

From Jaehwa Jung <blrun...@apache.org>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Wed, 16 Apr 2014 02:15:43 GMT
Hi, Alvin

I'm sorry for the late response, and thank you very much for your contribution.
I agree with your opinion on ZooKeeper. However, ZooKeeper is an
additional dependency that some users may not want.

I'd like to suggest adding an abstraction layer for handling TajoMaster HA.
When I created TAJO-740, I wanted TajoMaster HA to have a
generic interface and a basic implementation using HDFS. Your
proposed ZooKeeper implementation could then be added there. This would allow
users to choose their desired implementation according to their environment.
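
To make the abstraction layer idea more concrete, here is a minimal sketch in
Java. All names (HAService, InMemoryHAService, the method names, and the
host:port strings) are hypothetical and only for illustration; they are not
existing Tajo APIs. The in-memory class stands in for a real backing store
such as HDFS or ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical generic HA interface; names are illustrative, not Tajo's API. */
interface HAService {
    /** Registers a master; returns true if it is the active master. */
    boolean register(String masterAddress);

    /** Removes a master, e.g. on shutdown or after a detected failure. */
    void unregister(String masterAddress);

    /** Address of the active master, if there is one. */
    Optional<String> activeMaster();

    /** Addresses of the registered backup masters, in registration order. */
    List<String> backupMasters();
}

/** In-memory stand-in for an HDFS- or ZooKeeper-backed implementation. */
class InMemoryHAService implements HAService {
    private String active;
    private final List<String> backups = new ArrayList<>();

    @Override
    public synchronized boolean register(String masterAddress) {
        if (active == null) {
            active = masterAddress; // first master to register becomes active
            return true;
        }
        if (!active.equals(masterAddress) && !backups.contains(masterAddress)) {
            backups.add(masterAddress); // later masters wait as backups
        }
        return active.equals(masterAddress);
    }

    @Override
    public synchronized void unregister(String masterAddress) {
        if (masterAddress.equals(active)) {
            // Fail over: promote the oldest backup, if any.
            active = backups.isEmpty() ? null : backups.remove(0);
        } else {
            backups.remove(masterAddress);
        }
    }

    @Override
    public synchronized Optional<String> activeMaster() {
        return Optional.ofNullable(active);
    }

    @Override
    public synchronized List<String> backupMasters() {
        return new ArrayList<>(backups);
    }
}
```

With an interface of this kind, TajoMaster would depend only on HAService,
and the implementation could be selected by configuration.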

In addition, I'd like to propose that TajoMaster embed the HA module; it
would be great if HA worked simply by launching a backup TajoMaster.
Deploying an additional process besides the TajoMaster and TajoWorker
processes may put more burden on users.


2014-04-13 14:36 GMT+09:00 Jihoon Son <jihoonson@apache.org>:

> Hi Alvin.
> Thanks for your suggestion.
> Overall, your suggestion looks very reasonable to me!
> I'll check the POC.
> Many thanks,
> Jihoon
>
> Hi All,
> After doing a lot of research, in my opinion we should utilize
> ZooKeeper for Tajo Master HA. I have created a small POC and shared it on my
> GitHub repository (git@github.com:alvinhenrick/zooKeeper-poc.git).
> To make things a little easier and more maintainable, I am
> utilizing Apache Curator, the fluent ZooKeeper client API developed at
> Netflix, which is now an Apache open source project.
> I have attached a diagram to convey my message to the team
> members. I will upload it to JIRA once everyone agrees with the proposed
> solution.
> Here is what the flow is going to look like.
> TajoMasterZkController ==>
>    1. This component will start, connect to the ZooKeeper quorum, and fight
>       ( :) ) to obtain the latch/lock to become the master.
>    2. Once the lock is obtained, the Apache Curator API will invoke the
>       takeLeadership() method, which at this time will start the TajoMaster.
>    3. As long as the TajoMaster is running, the controller will keep the
>       lock and update the metadata on the ZooKeeper server with the
>       PORT.
>    4. The other participants will keep waiting for the latch/lock to be
>       released by ZooKeeper to obtain the leadership.
>    5. The advantage is that we can have as many Tajo Masters as we want, but
>       only one can be the leader, and it will consume resources only after
>       obtaining the latch/lock.
> TajoWorkerZkController ==>
>    1. This component will start, connect to ZooKeeper (creating an
>       EPHEMERAL ZNODE), and wait for events from ZooKeeper.
>    2. The first listener will listen for successful registration.
>    3. The second listener, on the master node, will listen for any changes to
>       the master node received from the ZooKeeper server.
>    4. If a failover occurs, the data on the master ZNODE will be
>       changed, the new HOSTNAME and RPC PORT can be obtained, and the
>       TajoWorker can establish a new RPC connection with the TajoMaster.
> To demonstrate, I have created a small Readme.txt file
> on GitHub on how to run the example. Please read the log statements on the
> console.
> Similar to TajoWorkerZkController, we can also
> implement TajoClientZkController.
> Any help or advice is appreciated.
> Thanks!
> Warm Regards,
> Alvin.
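
The controller flow quoted above can be sketched roughly as follows. This is
only an in-memory simulation in Java, not the POC itself and not the ZooKeeper
or Curator API: FakeCoordinator, FakeWorker, and the host:port strings are
hypothetical names. The list of candidates plays the role of the latch/lock,
and the published string plays the role of the data on the master ZNODE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory stand-in for the ZooKeeper quorum (hypothetical, for illustration). */
class FakeCoordinator {
    // The first entry holds the lock; the others are waiting for it.
    private final List<String> candidates = new ArrayList<>();
    private final List<Consumer<String>> watchers = new ArrayList<>();
    private String masterData; // plays the role of the master ZNODE data

    /** A master joins the election; returns true if it obtained the lock. */
    synchronized boolean join(String hostAndPort) {
        candidates.add(hostAndPort);
        if (candidates.size() == 1) {
            publish(hostAndPort); // the leader records its address
        }
        return candidates.get(0).equals(hostAndPort);
    }

    /** Simulates a lost session: the entry vanishes and the next waiter leads. */
    synchronized void fail(String hostAndPort) {
        boolean wasLeader = !candidates.isEmpty()
                && candidates.get(0).equals(hostAndPort);
        candidates.remove(hostAndPort);
        if (wasLeader) {
            publish(candidates.isEmpty() ? null : candidates.get(0));
        }
    }

    /** A worker registers a listener for changes to the master data. */
    synchronized void watch(Consumer<String> listener) {
        watchers.add(listener);
        if (masterData != null) {
            listener.accept(masterData); // deliver the current master at once
        }
    }

    private void publish(String data) {
        masterData = data;
        if (data != null) {
            for (Consumer<String> w : watchers) {
                w.accept(data);
            }
        }
    }
}

/** Worker side: switches its connection when the master address changes. */
class FakeWorker {
    String connectedTo; // address of the master this worker currently uses

    FakeWorker(FakeCoordinator coordinator) {
        coordinator.watch(addr -> connectedTo = addr);
    }
}
```

In a real implementation the lock, the ephemeral nodes, and the watches would
be provided by ZooKeeper (for example through Curator's leader election
recipes) instead of this simulation.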
