tajo-dev mailing list archives

From Xuhui Liu <maf...@gmail.com>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Sun, 20 Apr 2014 09:46:27 GMT
Hi Guys,

I have attached the latest patch for service discovery at
https://issues.apache.org/jira/browse/TAJO-611.
1. I only added the service itself; no existing code has been modified to use it yet.
2. ZooKeeper still needs to be added to Tajo; that work hasn't started yet.

We can discuss how to introduce ZooKeeper to Tajo.

Thanks,
Xuhui


On Thu, Apr 17, 2014 at 4:32 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:

> Xuhui,
>
> ZK is not based on PAXOS; instead, it uses Zab (ZooKeeper Atomic Broadcast),
> which is a different protocol from PAXOS.
>
>
>
> On Thu, Apr 17, 2014 at 4:19 PM, Xuhui Liu <mafish@gmail.com> wrote:
>
> > It seems ZK is based on PAXOS. Then it will be much simpler. We can focus
> > on how to use ZK well.
> >
> > Cheers,
> > Xuhui
> >
> >
> > On Thu, Apr 17, 2014 at 4:14 PM, Xuhui Liu <mafish@gmail.com> wrote:
> >
> > > Regarding the HA of TajoMaster: keeping the primary master and the
> > > slave masters consistent will be a big challenge. Have we ever thought
> > > about the PAXOS protocol? It's designed to maintain consistency in a
> > > distributed environment.
> > >
> > > Thanks,
> > > Daniel
> > >
> > >
> > > On Wed, Apr 16, 2014 at 7:56 PM, Hyunsik Choi <hyunsik@apache.org> wrote:
> > >
> > >> Hi Alvin,
> > >>
> > >> First of all, thank you, Alvin, for your contribution. Your proposal
> > >> looks nice and reasonable to me.
> > >>
> > >> BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap
> > >> somewhat. We need to arrange the tasks to avoid duplicated work.
> > >>
> > >> In my opinion, the TajoMaster HA feature involves three sub-features:
> > >>   1) Leader election among multiple TajoMasters - one of the
> > >> TajoMasters is always the leader TajoMaster.
> > >>   2) Service discovery on the TajoClient side - TajoClient API calls
> > >> should be resilient even when the original TajoMaster is not available.
> > >>   3) Cluster resource management and catalog information that the
> > >> TajoMaster keeps in main memory - this information should not be lost.
> > >>
> > >> I think that (1) and (2) duplicate TAJO-611 for service discovery.
> > >> So, it would be nice if TAJO-704 focused only on (3), because
> > >> TAJO-611 already started a few weeks ago and TAJO-704 may be at a
> > >> relatively earlier stage. *Instead, you can continue the work with
> > >> Xuhui and Min.* Someone can divide the service discovery issue into
> > >> more subtasks.
> > >>
> > >> In addition, I'd like to discuss (3) further. Currently, a running
> > >> TajoMaster keeps two kinds of information: cluster resource information
> > >> for all workers and catalog information. In order to guarantee the HA
> > >> of this data, TajoMaster should either persistently materialize it or
> > >> consistently synchronize it across multiple TajoMasters. BTW, we will
> > >> replace the resource management feature of TajoMaster with a
> > >> decentralized one in the new scheduler issue. As a result, I think that
> > >> TajoMaster HA needs to focus only on the high availability of catalog
> > >> information. The HA of the catalog can be easily achieved by database
> > >> replication, or we can make our own module for it. I prefer the former.
> > >>
> > >> Hi Xuhui and Min,
> > >>
> > >> Could you briefly share the progress of the service discovery issue?
> > >> If so, we can easily figure out how to start the service discovery
> > >> work together.
> > >>
> > >> Warm regards,
> > >> Hyunsik
> > >>
> > >>
> > >>
> > >> On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <coderplay@gmail.com> wrote:
> > >>
> > >> > Actually, we are thinking not only about HA, but also about the
> > >> > service discovery that the future Tajo scheduler would rely on. The
> > >> > Tajo scheduler can get all the active workers from that service.
> > >> >
> > >> >
> > >> > Regards,
> > >> > Min
> > >> >
> > >> >
> > >> > On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <mafish@gmail.com> wrote:
> > >> >
> > >> > > Hi Alvin,
> > >> > >
> > >> > > TAJO-611 will introduce Curator as a service discovery service to
> > >> > > Tajo, and Curator is based on ZK. Maybe we can work together.
> > >> > >
> > >> > > Thanks,
> > >> > > Xuhui
> > >> > >
> > >> > >
> > >> > > On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <coderplay@gmail.com> wrote:
> > >> > >
> > >> > > > Hi Alvin,
> > >> > > >
> > >> > > > I think this JIRA overlaps somewhat with TAJO-611. Could you
> > >> > > > cooperate on it?
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Min
> > >> > > >
> > >> > > >
> > >> > > > On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <henry.saputra@gmail.com> wrote:
> > >> > > >
> > >> > > > > Jaehwa, I think we should think about a pluggable mechanism that
> > >> > > > > would allow some kind of distributed system like ZK to be used if
> > >> > > > > wanted.
> > >> > > > >
> > >> > > > > - Henry
> > >> > > > >
> > >> > > > > On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <blrunner@apache.org> wrote:
> > >> > > > > > Hi, Alvin
> > >> > > > > >
> > >> > > > > > I'm sorry for the late response, and thank you very much for your
> > >> > > > > > contribution. I agree with your opinion about ZooKeeper. But
> > >> > > > > > ZooKeeper introduces an additional dependency that some users
> > >> > > > > > may not want.
> > >> > > > > >
> > >> > > > > > I'd like to suggest adding an abstraction layer for handling
> > >> > > > > > TajoMaster HA. When I created TAJO-740, I wished that TajoMaster
> > >> > > > > > HA would have a generic interface and a basic implementation
> > >> > > > > > using HDFS. Next, your proposed ZooKeeper implementation would be
> > >> > > > > > added there. It will allow users to choose their desired
> > >> > > > > > implementation according to their environments.
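The abstraction layer suggested above could look roughly like the following. This is only a minimal sketch in Python for brevity: the names HAService, InMemoryHAService and create_ha_service, the "memory" key, and the host/port values are all hypothetical, and the in-memory class merely stands in for the HDFS-based and ZooKeeper-based implementations discussed in this thread.

```python
from abc import ABC, abstractmethod


class HAService(ABC):
    """Hypothetical generic interface that every HA backend would implement."""

    @abstractmethod
    def register(self, host, port):
        """Register a master; return True if it became the active one."""

    @abstractmethod
    def active_master(self):
        """Return (host, port) of the active master, or None."""


class InMemoryHAService(HAService):
    """Toy backend for illustration: the first registrant is the active master."""

    def __init__(self):
        self._masters = []

    def register(self, host, port):
        self._masters.append((host, port))
        return len(self._masters) == 1

    def active_master(self):
        return self._masters[0] if self._masters else None


# An HDFS or ZooKeeper backend would be added to this table under its own key.
IMPLEMENTATIONS = {"memory": InMemoryHAService}


def create_ha_service(kind):
    """Pick a backend by name, e.g. from a configuration value."""
    if kind not in IMPLEMENTATIONS:
        raise ValueError("unknown HA implementation: " + kind)
    return IMPLEMENTATIONS[kind]()
```

With this shape, callers depend only on HAService, so the backend can be chosen per deployment without touching the code that uses it.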
> > >> > > > > >
> > >> > > > > > In addition, I'd like to propose that TajoMaster embed the HA
> > >> > > > > > module, and it would be great if HA worked simply by launching a
> > >> > > > > > backup TajoMaster. Deploying an additional process besides the
> > >> > > > > > TajoMaster and TajoWorker processes may put more burden on users.
> > >> > > > > >
> > >> > > > > > *Cheers*
> > >> > > > > > *Jaehwa*
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > 2014-04-13 14:36 GMT+09:00 Jihoon Son <jihoonson@apache.org>:
> > >> > > > > >
> > >> > > > > >> Hi Alvin.
> > >> > > > > >> Thanks for your suggestion.
> > >> > > > > >>
> > >> > > > > >> Overall, your suggestion looks very reasonable to me!
> > >> > > > > >> I'll check the POC.
> > >> > > > > >>
> > >> > > > > >> Many thanks,
> > >> > > > > >> Jihoon
> > >> > > > > >> Hi All,
> > >> > > > > >> After doing a lot of research, in my opinion we should utilize
> > >> > > > > >> ZooKeeper for TajoMaster HA. I have created a small POC and shared
> > >> > > > > >> it on my GitHub repository
> > >> > > > > >> (git@github.com:alvinhenrick/zooKeeper-poc.git).
> > >> > > > > >>
> > >> > > > > >> To make things a little easier and more maintainable, I am
> > >> > > > > >> utilizing Apache Curator, the fluent ZooKeeper client API
> > >> > > > > >> developed at Netflix, which is now an Apache open source project.
> > >> > > > > >>
> > >> > > > > >> I have attached the diagram to convey my message to the team
> > >> > > > > >> members. I will upload it to JIRA once everyone agrees with the
> > >> > > > > >> proposed solution.
> > >> > > > > >>
> > >> > > > > >> Here is what the flow is going to look like.
> > >> > > > > >>
> > >> > > > > >> TajoMasterZkController ==>
> > >> > > > > >>
> > >> > > > > >>    1. This component will start, connect to the ZooKeeper quorum,
> > >> > > > > >>       and fight ( :) ) to obtain the latch/lock to become the master.
> > >> > > > > >>    2. Once the lock is obtained, the Apache Curator API will invoke
> > >> > > > > >>       the takeLeadership() method, which at that point will start the
> > >> > > > > >>       TajoMaster.
> > >> > > > > >>    3. As long as the TajoMaster is running, the controller will keep
> > >> > > > > >>       the lock and update the metadata on the ZooKeeper server with
> > >> > > > > >>       the HOSTNAME and RPC PORT.
> > >> > > > > >>    4. The other participants will keep waiting for the latch/lock to
> > >> > > > > >>       be released by ZooKeeper to obtain the leadership.
> > >> > > > > >>    5. The advantage is that we can have as many TajoMasters as we
> > >> > > > > >>       want, but only one can be the leader, and it will consume
> > >> > > > > >>       resources only after obtaining the latch/lock.
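The TajoMasterZkController steps above can be sketched as follows. This is a minimal in-memory illustration in Python, not the Curator or ZooKeeper API: InMemoryLatch and MasterController are hypothetical names, a plain threading.Lock plays the role of the latch, compete() stands in for the point where takeLeadership() would be invoked, and the hostnames and port numbers are made up.

```python
import threading


class InMemoryLatch:
    """Stand-in for a ZooKeeper-backed latch (assumption: not the Curator API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.master_info = None  # plays the role of the data on the master znode

    def try_acquire(self, host, port):
        # Non-blocking acquire: only one participant can hold the latch.
        if self._lock.acquire(blocking=False):
            # Step 3: the holder publishes its HOSTNAME and RPC PORT.
            self.master_info = (host, port)
            return True
        return False

    def release(self):
        self.master_info = None
        self._lock.release()


class MasterController:
    def __init__(self, latch, host, port):
        self.latch, self.host, self.port = latch, host, port
        self.is_leader = False

    def compete(self):
        # Steps 1-2: fight for the latch; the winner would start the TajoMaster.
        self.is_leader = self.latch.try_acquire(self.host, self.port)
        return self.is_leader

    def shutdown(self):
        # Step 4: releasing the latch lets a waiting participant take over.
        if self.is_leader:
            self.is_leader = False
            self.latch.release()


latch = InMemoryLatch()
m1 = MasterController(latch, "master-1", 26001)
m2 = MasterController(latch, "master-2", 26001)
m1.compete()  # first participant wins the latch
m2.compete()  # second participant stays a standby
```

In a real deployment the latch would be released automatically when the leader's ZooKeeper session ends; here shutdown() has to be called explicitly before a standby can win a later compete().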
> > >> > > > > >>
> > >> > > > > >>
> > >> > > > > >> TajoWorkerZkController ==>
> > >> > > > > >>
> > >> > > > > >>    1. This component will start, connect to ZooKeeper (creating an
> > >> > > > > >>       EPHEMERAL ZNODE), and wait for events from ZooKeeper.
> > >> > > > > >>    2. The first listener will listen for successful registration.
> > >> > > > > >>    3. The second listener, on the master node, will listen for any
> > >> > > > > >>       changes to the master node received from the ZooKeeper server.
> > >> > > > > >>    4. If a failover occurs, the data on the master ZNODE will be
> > >> > > > > >>       changed, and the new HOSTNAME and RPC PORT can be obtained so
> > >> > > > > >>       that the TajoWorker can establish a new RPC connection with the
> > >> > > > > >>       TajoMaster.
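The worker-side flow above can be illustrated in the same spirit. Again this is an in-memory sketch under stated assumptions, not ZooKeeper itself: InMemoryRegistry and WorkerController are hypothetical names, the paths "/master" and "/workers/..." are invented for the example, and callbacks stay registered forever, whereas standard ZooKeeper watches are one-time triggers that must be set again after they fire.

```python
class InMemoryRegistry:
    """Stand-in for ZooKeeper: a dict of nodes plus per-path listeners."""

    def __init__(self):
        self.nodes = {}
        self.watchers = {}

    def watch(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def set(self, path, data):
        self.nodes[path] = data
        for callback in self.watchers.get(path, []):
            callback(path, data)

    def delete(self, path):
        # Models an ephemeral node disappearing when its owner goes away.
        self.nodes.pop(path, None)


class WorkerController:
    def __init__(self, registry, name):
        self.registry = registry
        self.path = "/workers/" + name  # hypothetical node path
        self.registered = False
        self.master = None  # (host, port) the worker currently talks to

    def start(self):
        # First listener: confirms our own registration (step 2).
        self.registry.watch(self.path, self._on_registered)
        # Second listener: follows changes to the master node (step 3).
        self.registry.watch("/master", self._on_master_changed)
        self.registry.set(self.path, "alive")
        # Pick up the current master if one is already published.
        if "/master" in self.registry.nodes:
            self._on_master_changed("/master", self.registry.nodes["/master"])

    def _on_registered(self, path, data):
        self.registered = True

    def _on_master_changed(self, path, data):
        # Step 4: a real worker would re-establish its RPC connection here.
        self.master = data


registry = InMemoryRegistry()
registry.set("/master", ("master-1", 26001))
worker = WorkerController(registry, "worker-1")
worker.start()
registry.set("/master", ("master-2", 26001))  # simulated failover
```

After the simulated failover, worker.master holds the new master's address, which is the point where the TajoWorker would reconnect.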
> > >> > > > > >>
> > >> > > > > >> To demonstrate, I have created a small Readme.txt file on GitHub
> > >> > > > > >> explaining how to run the example. Please read the log statements
> > >> > > > > >> on the console.
> > >> > > > > >>
> > >> > > > > >> Similar to TajoWorkerZkController, we can also implement
> > >> > > > > >> TajoClientZkController.
> > >> > > > > >>
> > >> > > > > >> Any help or advice is appreciated.
> > >> > > > > >>
> > >> > > > > >> Thanks!
> > >> > > > > >> Warm Regards,
> > >> > > > > >> Alvin.
> > >> > > > > >>
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > My research interests are distributed systems, parallel computing
> > >> > > > and bytecode based virtual machine.
> > >> > > >
> > >> > > > My profile:
> > >> > > > http://www.linkedin.com/in/coderplay
> > >> > > > My blog:
> > >> > > > http://coderplay.javaeye.com
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > My research interests are distributed systems, parallel computing
> > >> > and bytecode based virtual machine.
> > >> >
> > >> > My profile:
> > >> > http://www.linkedin.com/in/coderplay
> > >> > My blog:
> > >> > http://coderplay.javaeye.com
> > >> >
> > >>
> > >
> > >
> >
>
