tajo-dev mailing list archives

From Xuhui Liu <maf...@gmail.com>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Thu, 17 Apr 2014 08:14:18 GMT
Regarding the HA of TajoMaster: keeping the primary master and the slave
masters consistent will be a big challenge. Have we considered the Paxos
protocol? It is designed to maintain consistency in a distributed
environment.

Thanks,
Daniel


On Wed, Apr 16, 2014 at 7:56 PM, Hyunsik Choi <hyunsik@apache.org> wrote:

> Hi Alvin,
>
> First of all, thank you Alvin for your contribution. Your proposal looks
> nice and reasonable to me.
>
> BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap
> somewhat. We need to arrange the tasks to avoid duplicated
> work.
>
> In my opinion, the TajoMaster HA feature involves three sub-features:
>   1) Leader election among multiple TajoMasters - one of the TajoMasters
> is always the leader TajoMaster.
>   2) Service discovery on the TajoClient side - TajoClient API calls should be
> resilient even when the original TajoMaster is not available.
>   3) Cluster resource management and catalog information that TajoMaster
> keeps in main memory - this information should not be lost.
>
> I think that (1) and (2) duplicate TAJO-611, which covers service discovery.
> So, it would be nice if TAJO-704 focused only on (3), because
> TAJO-611 already started a few weeks ago and TAJO-704 may be at a relatively
> earlier stage. *Instead, you can continue the work with Xuhui and Min.*
> Someone can divide the service discovery issue into more subtasks.
>
> In addition, I'd like to discuss (3) further. Currently, a running TajoMaster
> keeps two kinds of information: cluster resource information for all workers
> and catalog information. To guarantee the HA of this data, TajoMaster
> should either persistently materialize it or consistently synchronize
> it across multiple TajoMasters. BTW, we will replace the resource
> management feature of TajoMaster with a decentralized design in the new
> scheduler issue. As a result, I think that TajoMaster HA needs to focus
> only on the high availability of catalog information. The HA of the catalog
> can be easily achieved by database replication, or we can build our own
> module for it. I prefer the former.
>
> Hi Xuhui and Min,
>
> Could you briefly share the progress of the service discovery issue? Then we
> can more easily figure out how to start working on service discovery together.
>
> Warm regards,
> Hyunsik
>
>
>
> On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <coderplay@gmail.com> wrote:
>
> > Actually, we are thinking not only about HA, but also about the service
> > discovery that the future Tajo scheduler will rely on. The Tajo scheduler
> > can get all the active workers from that service.
> >
> >
> > Regards,
> > Min
> >
> >
> > On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <mafish@gmail.com> wrote:
> >
> > > Hi Alvin,
> > >
> > > TAJO-611 will introduce Curator as a service discovery service for Tajo,
> > > and Curator is based on ZK. Maybe we can work together.
> > >
> > > Thanks,
> > > Xuhui
> > >
> > >
> > > On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <coderplay@gmail.com> wrote:
> > >
> > > > Hi Alvin,
> > > >
> > > > I think this JIRA overlaps somewhat with TAJO-611. Could you
> > > > cooperate on it?
> > > >
> > > > Thanks,
> > > > Min
> > > >
> > > >
> > > > On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <henry.saputra@gmail.com> wrote:
> > > >
> > > > > Jaehwa, I think we should think about a pluggable mechanism that would
> > > > > allow some kind of distributed system like ZK to be used if wanted.
> > > > >
> > > > > - Henry
> > > > >
> > > > > On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <blrunner@apache.org> wrote:
> > > > > > Hi, Alvin
> > > > > >
> > > > > > I'm sorry for the late response, and thank you very much for your
> > > > > > contribution. I agree with your opinion on ZooKeeper. However,
> > > > > > ZooKeeper adds an additional dependency that some users may not want.
> > > > > >
> > > > > > I'd like to suggest adding an abstraction layer for handling
> > > > > > TajoMaster HA. When I created TAJO-740, I wished that TajoMaster HA
> > > > > > would have a generic interface and a basic implementation using HDFS.
> > > > > > Your proposed ZooKeeper implementation could then be added there. This
> > > > > > would allow users to choose their desired implementation according to
> > > > > > their environments.
> > > > > >
> > > > > > In addition, I'd like to propose that TajoMaster embed the HA module;
> > > > > > it would be great if HA worked simply by launching a backup TajoMaster.
> > > > > > Deploying an additional process besides the TajoMaster and TajoWorker
> > > > > > processes may put more burden on users.
> > > > > >
> > > > > > *Cheers*
> > > > > > *Jaehwa*
> > > > > >
> > > > > >
> > > > > > 2014-04-13 14:36 GMT+09:00 Jihoon Son <jihoonson@apache.org>:
> > > > > >
> > > > > >> Hi Alvin.
> > > > > >> Thanks for your suggestion.
> > > > > >>
> > > > > > >> Overall, your suggestion looks very reasonable to me!
> > > > > >> I'll check the POC.
> > > > > >>
> > > > > >> Many thanks,
> > > > > >> Jihoon
> > > > > > >> Hi All,
> > > > > > >> After doing a lot of research, in my opinion we should use
> > > > > > >> ZooKeeper for TajoMaster HA. I have created a small POC and shared
> > > > > > >> it on my GitHub repository
> > > > > > >> (git@github.com:alvinhenrick/zooKeeper-poc.git).
> > > > > > >>
> > > > > > >> To make things a little easier and more maintainable, I am
> > > > > > >> using Apache Curator, the fluent ZooKeeper client API developed at
> > > > > > >> Netflix and now an Apache open source project.
> > > > > > >>
> > > > > > >> I have attached a diagram to convey my message to the team
> > > > > > >> members. I will upload it to JIRA once everyone agrees with the
> > > > > > >> proposed solution.
> > > > > >>
> > > > > >>             Here is the flow going to look like.
> > > > > >>
> > > > > >>             TajoMasterZkController   ==>
> > > > > >>
> > > > > >>
> > > > > >>    1. This component  will start and connect to zookeeper
quorum
> > and
> > > > > fight
> > > > > >>       ( :) ) to obtain the latch / lock to become the master
.
> > > > > >>       2. Once the lock is obtained the Apache Curator API
will
> > > invoke
> > > > > >>       takeLeadership () method at this time will start the
> > > TajoMaster.
> > > > > >>       3. As long as the TajoMaster is running the Controller
> will
> > > keep
> > > > > the
> > > > > >>       lock and update the meta data on zookeeper server
with the
> > > > > >> HOSTNAME and RPC
> > > > > >>       PORT.
> > > > > >>       4. The other participant will keep waiting for the
latch/
> > lock
> > > > to
> > > > > be
> > > > > >>       released by zookeeper to obtain the leadership.
> > > > > >>       5. The advantage is we can have as many Tajo Master's
as
> we
> > > > wan't
> > > > > but
> > > > > >>       only one can be the leader and will consume the resources
> > only
> > > > > after
> > > > > >>       obtaining the latch/lock.
> > > > > >>
> > > > > >>
> > > > > >>            TajoWorkerZkController ==>
> > > > > >>
> > > > > >>    1. This component  will start and connect to zookeeper
(will
> > > create
> > > > > >>       EPHEMERAL ZNODE) and wait for the events from zookeeper.
> > > > > >>       2. The first listener will listener for successful
> > > registration.
> > > > > >>       3. The second listener on master node will listen
for any
> > > > >  changes to
> > > > > >>       the master node received from zookeeper server.
> > > > > >>       4.  If the failover occurs the data on the master
ZNODE
> will
> > > be
> > > > > >>       changed and the new HOSTNAME and RPC PORT can be obtained
> > and
> > > > the
> > > > > >>       TajoWorker can establish the new RPC connection with
the
> > > > > TajoMaster.
> > > > > >>
> > > > > >>           To demonstrate I have created the small Readme.txt
> file
> > > > > >> on Github on how to run the example. Please read the log
> > statements
> > > on
> > > > > the
> > > > > >> console.
> > > > > >>
> > > > > >>           Similar to TajoWorkerZkController we can also
> > > > > >> implement TajoClientZkController.
> > > > > >>
> > > > > >>           Any help or advice is appreciated.
> > > > > >>
> > > > > >> Thanks!
> > > > > >> Warm Regards,
> > > > > >> Alvin.
> > > > > >>
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > My research interests are distributed systems, parallel computing and
> > > > bytecode based virtual machine.
> > > >
> > > > My profile:
> > > > http://www.linkedin.com/in/coderplay
> > > > My blog:
> > > > http://coderplay.javaeye.com
> > > >
> > >
> >
> >
> >
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
> >
>
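
The TajoMasterZkController and TajoWorkerZkController flows quoted above, together with the pluggable HA interface suggested in the thread, can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the POC's code and not the Curator API: it uses no ZooKeeper at all, and the names HAService, InMemoryHAService, and Worker, the host names, and the port number are hypothetical placeholders chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Optional, Tuple

Address = Tuple[str, int]  # (hostname, RPC port), as published by the leader


class HAService(ABC):
    """Hypothetical pluggable HA interface (an assumption, not Tajo's API).

    A ZooKeeper-backed or HDFS-backed class could implement the same methods.
    """

    @abstractmethod
    def join_election(self, master_id: str, address: Address) -> None: ...

    @abstractmethod
    def leave(self, master_id: str) -> None: ...

    @abstractmethod
    def current_leader(self) -> Optional[Address]: ...

    @abstractmethod
    def watch_leader(self, callback: Callable[[Optional[Address]], None]) -> None: ...


class InMemoryHAService(HAService):
    """Single-process stand-in for the latch/lock: the oldest candidate leads."""

    def __init__(self) -> None:
        self._candidates: List[Tuple[str, Address]] = []  # FIFO queue of masters
        self._watchers: List[Callable[[Optional[Address]], None]] = []

    def join_election(self, master_id: str, address: Address) -> None:
        first = not self._candidates
        self._candidates.append((master_id, address))
        if first:  # the first candidate obtains the latch immediately
            self._notify()

    def leave(self, master_id: str) -> None:
        # Models a master stopping or crashing and its lock being released.
        before = self.current_leader()
        self._candidates = [c for c in self._candidates if c[0] != master_id]
        if self.current_leader() != before:
            self._notify()

    def current_leader(self) -> Optional[Address]:
        return self._candidates[0][1] if self._candidates else None

    def watch_leader(self, callback: Callable[[Optional[Address]], None]) -> None:
        self._watchers.append(callback)

    def _notify(self) -> None:
        leader = self.current_leader()
        for callback in self._watchers:
            callback(leader)


class Worker:
    """Stand-in for a worker that follows leader changes."""

    def __init__(self, ha: HAService) -> None:
        self.master: Optional[Address] = ha.current_leader()
        ha.watch_leader(self._on_leader_change)

    def _on_leader_change(self, address: Optional[Address]) -> None:
        # A real worker would re-establish its RPC connection here.
        self.master = address


if __name__ == "__main__":
    ha = InMemoryHAService()
    ha.join_election("master-1", ("host1", 26001))  # placeholder addresses
    ha.join_election("master-2", ("host2", 26001))
    worker = Worker(ha)
    print("leader before failover:", worker.master)
    ha.leave("master-1")  # simulate the leader going away
    print("leader after failover:", worker.master)
```

In this sketch the waiting master takes over only when the current leader leaves, and the worker learns the new address through its registered callback rather than by polling. A real deployment would replace InMemoryHAService with an implementation backed by a coordination service, which is where the consistency questions raised at the top of the thread come in.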
