tajo-dev mailing list archives

From Hyunsik Choi <hyun...@apache.org>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Wed, 16 Apr 2014 11:56:06 GMT
Hi Alvin,

First of all, thank you Alvin for your contribution. Your proposal looks
nice and reasonable to me.

BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap somewhat.
We need to coordinate the tasks to avoid duplicated work.

In my opinion, the TajoMaster HA feature involves three sub-features:
  1) Leader election among multiple TajoMasters - one of the TajoMasters
is always the leader TajoMaster.
  2) Service discovery on the TajoClient side - TajoClient API calls should
be resilient even when the original TajoMaster is not available.
  3) Cluster resource management and catalog information that TajoMaster
keeps in main memory - this information should not be lost.
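
As a rough illustration of (2) only, the sketch below shows a client call
that tries a list of known master addresses in turn. Every name in it
(call_with_failover, MasterUnavailable, fake_rpc, the host names and port)
is hypothetical and made up for this example; none of it is Tajo's client
API, and it does not show how the addresses would actually be discovered.

```python
# Illustrative only: these names are hypothetical, not Tajo's client API.
from typing import Callable, Optional, Sequence


class MasterUnavailable(Exception):
    """Raised when a master address cannot serve the request."""


def call_with_failover(masters: Sequence[str], rpc: Callable[[str], str]) -> str:
    """Try each known master address in order; return the first successful reply."""
    last_error: Optional[Exception] = None
    for address in masters:
        try:
            return rpc(address)
        except MasterUnavailable as err:
            last_error = err  # remember the failure and try the next candidate
    raise MasterUnavailable(f"no master reachable among {list(masters)}") from last_error


def fake_rpc(address: str) -> str:
    """Stand-in for a real RPC: only master-b is alive in this example."""
    if address != "master-b:9000":
        raise MasterUnavailable(address)
    return f"ok from {address}"


reply = call_with_failover(["master-a:9000", "master-b:9000"], fake_rpc)
```

In this sketch the caller never needs to know which master answered, which
is the resilience property (2) asks for.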

I think that (1) and (2) duplicate the service discovery work in TAJO-611.
So it would be nice if TAJO-704 focused only on (3), because TAJO-611
already started a few weeks ago and TAJO-704 is probably at a relatively
earlier stage. *Instead, you can continue the work with Xuhui and Min.*
Someone can divide the service discovery issue into more subtasks.

In addition, I'd like to discuss (3) further. Currently, a running
TajoMaster keeps two kinds of information: cluster resource information for
all workers and catalog information. To guarantee HA of this data,
TajoMaster should either persistently materialize it or consistently
synchronize it across multiple TajoMasters. BTW, we will move the resource
management feature of TajoMaster to a decentralized design in the new
scheduler issue. As a result, I think that TajoMaster HA needs to focus
only on the high availability of catalog information. HA of the catalog can
be achieved easily through database replication, or we can build our own
module for it. I prefer the former.

Hi Xuhui and Min,

Could you briefly share the progress of the service discovery issue? Then
we can easily figure out how to start on service discovery together.

Warm regards,
Hyunsik



On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <coderplay@gmail.com> wrote:

> Actually, we are thinking not only about HA, but also about a service
> discovery service that the future Tajo scheduler would rely on. The Tajo
> scheduler can get all the active workers from that service.
>
>
> Regards,
> Min
>
>
> On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <mafish@gmail.com> wrote:
>
> > Hi Alvin,
> >
> > TAJO-611 will introduce Curator as a service discovery service to Tajo,
> > and Curator is based on ZK. Maybe we can work together.
> >
> > Thanks,
> > Xuhui
> >
> >
> > On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <coderplay@gmail.com> wrote:
> >
> > > Hi Alvin,
> > >
> > > I think this JIRA overlaps somewhat with TAJO-611. Could you cooperate
> > > on it?
> > >
> > > Thanks,
> > > Min
> > >
> > >
> > > On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <henry.saputra@gmail.com> wrote:
> > >
> > > > Jaehwa, I think we should think about a pluggable mechanism that would
> > > > allow some kind of distributed system like ZK to be used if wanted.
> > > >
> > > > - Henry
> > > >
> > > > On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <blrunner@apache.org> wrote:
> > > > > Hi, Alvin
> > > > >
> > > > > I'm sorry for the late response, and thank you very much for your
> > > > > contribution. I agree with your opinion on ZooKeeper. But ZooKeeper
> > > > > adds an additional dependency that some users may not want.
> > > > >
> > > > > I'd like to suggest adding an abstraction layer for handling
> > > > > TajoMaster HA. When I created TAJO-740, I wished that TajoMaster HA
> > > > > would have a generic interface and a basic implementation using HDFS.
> > > > > Your proposed ZooKeeper implementation could then be added there. It
> > > > > would allow users to choose their desired implementation according to
> > > > > their environments.
> > > > >
> > > > > In addition, I'd like to propose that TajoMaster embed the HA module;
> > > > > it would be great if HA worked simply by launching a backup
> > > > > TajoMaster. Deploying an additional process besides the TajoMaster and
> > > > > TajoWorker processes may put more burden on users.
> > > > >
> > > > > *Cheers*
> > > > > *Jaehwa*
> > > > >
> > > > >
> > > > > 2014-04-13 14:36 GMT+09:00 Jihoon Son <jihoonson@apache.org>:
> > > > >
> > > > >> Hi Alvin.
> > > > >> Thanks for your suggestion.
> > > > >>
> > > > >> Overall, your suggestion looks very reasonable to me!
> > > > >> I'll check the POC.
> > > > >>
> > > > >> Many thanks,
> > > > >> Jihoon
> > > > >> Hi All,
> > > > >>
> > > > >> After a lot of research, my opinion is that we should use ZooKeeper
> > > > >> for TajoMaster HA. I have created a small POC and shared it in my
> > > > >> GitHub repository (git@github.com:alvinhenrick/zooKeeper-poc.git).
> > > > >>
> > > > >> To make things a little easier and more maintainable, I am using
> > > > >> Apache Curator, the fluent ZooKeeper client API developed at Netflix,
> > > > >> which is now an Apache open source project.
> > > > >>
> > > > >> I have attached a diagram to convey my message to the team members. I
> > > > >> will upload it to JIRA once everyone agrees with the proposed
> > > > >> solution.
> > > > >>
> > > > >> Here is what the flow is going to look like.
> > > > >>
> > > > >> TajoMasterZkController ==>
> > > > >>
> > > > >>    1. This component will start, connect to the ZooKeeper quorum, and
> > > > >>       fight ( :) ) to obtain the latch/lock to become the master.
> > > > >>    2. Once the lock is obtained, the Apache Curator API will invoke the
> > > > >>       takeLeadership() method, which will start the TajoMaster.
> > > > >>    3. As long as the TajoMaster is running, the controller will keep
> > > > >>       the lock and update the metadata on the ZooKeeper server with the
> > > > >>       HOSTNAME and RPC PORT.
> > > > >>    4. The other participants will keep waiting for the latch/lock to be
> > > > >>       released by ZooKeeper to obtain the leadership.
> > > > >>    5. The advantage is that we can have as many TajoMasters as we want,
> > > > >>       but only one can be the leader, and a master will consume
> > > > >>       resources only after obtaining the latch/lock.
> > > > >>
> > > > >>
> > > > >> TajoWorkerZkController ==>
> > > > >>
> > > > >>    1. This component will start, connect to ZooKeeper (it will create
> > > > >>       an EPHEMERAL ZNODE), and wait for events from ZooKeeper.
> > > > >>    2. The first listener will listen for successful registration.
> > > > >>    3. The second listener, on the master node, will listen for any
> > > > >>       changes to the master node received from the ZooKeeper server.
> > > > >>    4. If a failover occurs, the data on the master ZNODE will be
> > > > >>       changed, the new HOSTNAME and RPC PORT can be obtained, and the
> > > > >>       TajoWorker can establish a new RPC connection with the
> > > > >>       TajoMaster.
> > > > >>
> > > > >> To demonstrate, I have created a small Readme.txt file on GitHub on
> > > > >> how to run the example. Please read the log statements on the
> > > > >> console.
> > > > >>
> > > > >> Similar to TajoWorkerZkController, we can also implement
> > > > >> TajoClientZkController.
> > > > >>
> > > > >> Any help or advice is appreciated.
> > > > >>
> > > > >> Thanks!
> > > > >> Warm Regards,
> > > > >> Alvin.
> > > > >>
> > > >
> > >
> > >
> > >
> > > --
> > > My research interests are distributed systems, parallel computing and
> > > bytecode based virtual machine.
> > >
> > > My profile:
> > > http://www.linkedin.com/in/coderplay
> > > My blog:
> > > http://coderplay.javaeye.com
> > >
> >
>
>
>
>
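
The TajoMasterZkController and TajoWorkerZkController flows quoted above
can be pictured with a small in-memory simulation. This is only a sketch
under loud assumptions: FakeCoordinator is a made-up stand-in for the
ZooKeeper quorum, Worker and all method names are hypothetical, and nothing
here uses the real Curator or ZooKeeper APIs. It only shows the shape of
the idea: masters contend for one leader slot, the leader's "host:port" is
published, and workers that watch the slot learn the new address after a
failover.

```python
# Illustrative simulation only: FakeCoordinator stands in for a ZooKeeper
# quorum, and none of these names come from Tajo, Curator, or ZooKeeper.
from typing import Callable, List, Optional


class FakeCoordinator:
    """In-memory stand-in for the coordination service: one leader slot plus watchers."""

    def __init__(self) -> None:
        self.leader: Optional[str] = None   # "host:port" of the current leader
        self.waiting: List[str] = []        # candidates queued for the latch
        self.watchers: List[Callable[[Optional[str]], None]] = []

    def contend(self, address: str) -> None:
        """A master asks for the latch; it leads at once only if the slot is free."""
        if self.leader is None:
            self._set_leader(address)
        else:
            self.waiting.append(address)

    def session_lost(self, address: str) -> None:
        """Simulate a master crash: drop it and promote the next waiting candidate."""
        if address == self.leader:
            self._set_leader(self.waiting.pop(0) if self.waiting else None)
        elif address in self.waiting:
            self.waiting.remove(address)

    def watch(self, callback: Callable[[Optional[str]], None]) -> None:
        """A worker registers a listener and immediately learns the current leader."""
        self.watchers.append(callback)
        callback(self.leader)

    def _set_leader(self, address: Optional[str]) -> None:
        """Publish the new leader address and notify every watcher."""
        self.leader = address
        for notify in self.watchers:
            notify(address)


class Worker:
    """Tracks which master address it should hold an RPC connection to."""

    def __init__(self, coordinator: FakeCoordinator) -> None:
        self.master: Optional[str] = None
        coordinator.watch(self.on_master_change)

    def on_master_change(self, address: Optional[str]) -> None:
        self.master = address  # a real worker would re-establish its RPC connection here


zk = FakeCoordinator()
zk.contend("master-1:9000")   # first candidate takes the latch
zk.contend("master-2:9000")   # second candidate waits as a backup
worker = Worker(zk)
before = worker.master
zk.session_lost("master-1:9000")  # failover: the backup is promoted
after = worker.master
```

In a real deployment the promotion and notification would be driven by the
coordination service itself rather than by explicit calls like
session_lost, which exists here only to trigger the failover by hand.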
