tajo-dev mailing list archives

From Min Zhou <coderp...@gmail.com>
Subject Re: JIRA-704 : TajoMaster High Availability .
Date Wed, 16 Apr 2014 17:22:23 GMT
Hi Alvin,

Thank you for your understanding.


Xuhui,
Could you please share your design and current progress here?


Thanks,
Min




On Wed, Apr 16, 2014 at 7:42 AM, Hyunsik Choi <hyunsik@apache.org> wrote:

> I'm sorry for the late response, and thank you Alvin for your understanding.
>
> Best Regards,
> Hyunsik
>
>
> On Wed, Apr 16, 2014 at 11:19 PM, Alvin Henrick <share.code@aol.com>
> wrote:
>
> > Hi All,
> > Not a problem. I wasn't aware that 704 was overlapping with 611. Yes, I
> > was planning to use Apache Curator as well; I did a small POC and posted
> > it on GitHub. Apache Curator has a service discovery recipe which we can
> > use.
> > As per Hyunsik, the only work left on 704 is Catalog replication across
> > TajoMasters, which can be easily achieved via database replication.
> >
> > Xuhui and Min,
> > Let me know if I can help, because I have done some good research on
> > Apache Curator and ZooKeeper (how to utilize/configure the Apache Curator
> > APIs).
> > Here is the Git repository where I did some work for 704 before getting
> > into the real implementation: git@github.com:alvinhenrick/zooKeeper-poc.git
> >
> > I will remove the in-progress status, associate 704 with 611, and move on
> > to tackle another interesting/priority issue :). Let me know how you guys
> > want to tackle this so that we don't duplicate the effort.
> >
> > Have a wonderful day!!!
> >
> > Thanks!
> > Warm Regards,
> > Alvin.
> >
> >
> > On Apr 16, 2014, at 6:56 AM, Hyunsik Choi wrote:
> >
> > > Hi Alvin,
> > >
> > > First of all, thank you Alvin for your contribution. Your proposal looks
> > > nice and reasonable to me.
> > >
> > > BTW, as others mentioned, TAJO-704 and TAJO-611 seem to overlap
> > > somewhat. We need to arrange the tasks to avoid duplicated work.
> > >
> > > In my opinion, the TajoMaster HA feature involves three sub-features:
> > >  1) Leader election among multiple TajoMasters - one of the TajoMasters
> > >     is always the leader.
> > >  2) Service discovery on the TajoClient side - TajoClient API calls should
> > >     be resilient even when the original TajoMaster is not available.
> > >  3) Cluster resource management and catalog information that TajoMaster
> > >     keeps in main memory - this information should not be lost.
> > >
> > > I think that (1) and (2) duplicate TAJO-611 for service discovery. So, it
> > > would be nice if TAJO-704 focused only on (3), because TAJO-611 already
> > > started a few weeks ago and TAJO-704 may be at a relatively earlier
> > > stage. *Instead, you can continue the work with Xuhui and Min.* Someone
> > > can divide the service discovery issue into more subtasks.
> > >
> > > In addition, I'd like to discuss (3) further. Currently, a running
> > > TajoMaster keeps two kinds of information: cluster resource information
> > > for all workers and catalog information. In order to guarantee the HA of
> > > this data, TajoMaster should either persistently materialize it or
> > > consistently synchronize it across multiple TajoMasters. BTW, we will
> > > replace the resource management feature of TajoMaster with a
> > > decentralized one in the new scheduler issue. As a result, I think that
> > > TajoMaster HA needs to focus only on the high availability of catalog
> > > information. The HA of the catalog can be easily achieved by database
> > > replication, or we can make our own module for it. In my view, I prefer
> > > the former.
> > >
> > > Hi Xuhui and Min,
> > >
> > > Could you share the brief progress of the service discovery issue? Then
> > > we can easily figure out how to start on service discovery together.
> > >
> > > Warm regards,
> > > Hyunsik
> > >
> > >
> > >
> > > On Wed, Apr 16, 2014 at 3:36 PM, Min Zhou <coderplay@gmail.com> wrote:
> > >
> > > >> Actually, we are thinking not only about HA but also about service
> > > >> discovery, which the future Tajo scheduler would rely on. The Tajo
> > > >> scheduler can get all the active workers from that service.
> > >>
> > >>
> > >> Regards,
> > >> Min
> > >>
> > >>
> > >> On Tue, Apr 15, 2014 at 10:05 PM, Xuhui Liu <mafish@gmail.com> wrote:
> > >>
> > >>> Hi Alvin,
> > >>>
> > > >>> TAJO-611 will introduce Curator as a service discovery service to
> > > >>> Tajo, and Curator is based on ZK. Maybe we can work together.
> > >>>
> > >>> Thanks,
> > >>> Xuhui
> > >>>
> > >>>
> > > >>> On Wed, Apr 16, 2014 at 12:17 PM, Min Zhou <coderplay@gmail.com> wrote:
> > > >>>
> > > >>>> Hi Alvin,
> > > >>>>
> > > >>>> I think this JIRA overlaps somewhat with TAJO-611. Could you
> > > >>>> cooperate on it?
> > >>>>
> > >>>> Thanks,
> > >>>> Min
> > >>>>
> > >>>>
> > > >>>> On Tue, Apr 15, 2014 at 7:22 PM, Henry Saputra <henry.saputra@gmail.com> wrote:
> > > >>>>
> > > >>>>> Jaehwa, I think we should think about a pluggable mechanism that
> > > >>>>> would allow some kind of distributed system like ZK to be used if
> > > >>>>> wanted.
> > >>>>>
> > >>>>> - Henry
> > >>>>>
> > > >>>>> On Tue, Apr 15, 2014 at 7:15 PM, Jaehwa Jung <blrunner@apache.org> wrote:
> > > >>>>>> Hi, Alvin
> > > >>>>>>
> > > >>>>>> I'm sorry for the late response, and thank you very much for your
> > > >>>>>> contribution. I agree with your opinion on ZooKeeper. But ZooKeeper
> > > >>>>>> is an additional dependency that some users may not want.
> > >>>>>>
> > > >>>>>> I'd like to suggest adding an abstraction layer for handling
> > > >>>>>> TajoMaster HA. When I created TAJO-740, I wished that TajoMaster HA
> > > >>>>>> would have a generic interface and a basic implementation using
> > > >>>>>> HDFS. Next, your proposed ZooKeeper implementation would be added
> > > >>>>>> there. It would allow users to choose their desired implementation
> > > >>>>>> according to their environments.
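
The abstraction layer suggested above might look roughly like the following Java sketch. The names `HaService`, `tryBecomeActive`, `activeMaster`, `release` and `InMemoryHaService` are hypothetical and do not come from Tajo, and the addresses are example values. The in-memory class is only a stand-in used to exercise the interface; an HDFS-backed or ZooKeeper-backed implementation would sit behind the same interface.

```java
import java.util.Optional;

public class HaServiceSketch {
    /** Hypothetical abstraction: each backend (HDFS, ZooKeeper, ...) would implement this. */
    interface HaService {
        /** Tries to claim the active role; returns true if this address holds it afterwards. */
        boolean tryBecomeActive(String masterAddress);

        /** Address of the current active master, if there is one. */
        Optional<String> activeMaster();

        /** Gives up the active role if this address currently holds it. */
        void release(String masterAddress);
    }

    /** In-memory stand-in, not an HDFS or ZooKeeper backend. */
    static final class InMemoryHaService implements HaService {
        private String active; // null while nobody is active

        @Override
        public synchronized boolean tryBecomeActive(String masterAddress) {
            if (active == null) {
                active = masterAddress;
            }
            return active.equals(masterAddress);
        }

        @Override
        public synchronized Optional<String> activeMaster() {
            return Optional.ofNullable(active);
        }

        @Override
        public synchronized void release(String masterAddress) {
            if (masterAddress.equals(active)) {
                active = null;
            }
        }
    }

    public static void main(String[] args) {
        HaService ha = new InMemoryHaService();
        System.out.println("master1 active: " + ha.tryBecomeActive("master1:26001"));
        System.out.println("master2 active: " + ha.tryBecomeActive("master2:26001"));
        ha.release("master1:26001");
        System.out.println("master2 active after release: " + ha.tryBecomeActive("master2:26001"));
    }
}
```

With this shape, the choice of backend would be a configuration decision, and users who do not want a ZooKeeper dependency would never load that implementation.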
> > >>>>>>
> > > >>>>>> In addition, I'd like to propose that TajoMaster embed the HA
> > > >>>>>> module, and it would be great if HA worked simply by launching a
> > > >>>>>> backup TajoMaster. Deploying an additional process besides the
> > > >>>>>> TajoMaster and TajoWorker processes may put more burden on users.
> > >>>>>>
> > >>>>>> *Cheers*
> > >>>>>> *Jaehwa*
> > >>>>>>
> > >>>>>>
> > >>>>>> 2014-04-13 14:36 GMT+09:00 Jihoon Son <jihoonson@apache.org>:
> > >>>>>>
> > >>>>>>> Hi Alvin.
> > >>>>>>> Thanks for your suggestion.
> > >>>>>>>
> > > >>>>>>> Overall, your suggestion looks very reasonable to me!
> > >>>>>>> I'll check the POC.
> > >>>>>>>
> > >>>>>>> Many thanks,
> > >>>>>>> Jihoon
> > > >>>>>>>
> > > >>>>>>> Hi All,
> > > >>>>>>> After doing a lot of research, in my opinion we should utilize
> > > >>>>>>> ZooKeeper for TajoMaster HA. I have created a small POC and shared
> > > >>>>>>> it on my GitHub repository
> > > >>>>>>> (git@github.com:alvinhenrick/zooKeeper-poc.git).
> > > >>>>>>>
> > > >>>>>>> Just to make things a little bit easier and more maintainable, I am
> > > >>>>>>> utilizing Apache Curator, the fluent ZooKeeper client API developed
> > > >>>>>>> at Netflix, which is now an Apache open source project.
> > > >>>>>>>
> > > >>>>>>> I have attached a diagram to convey my message to the team members.
> > > >>>>>>> I will upload it to JIRA once everyone agrees with the proposed
> > > >>>>>>> solution.
> > >>>>>>>
> > >>>>>>>            Here is the flow going to look like.
> > >>>>>>>
> > >>>>>>>            TajoMasterZkController   ==>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>   1. This component  will start and connect to zookeeper
quorum
> > >> and
> > >>>>> fight
> > >>>>>>>      ( :) ) to obtain the latch / lock to become the
master .
> > >>>>>>>      2. Once the lock is obtained the Apache Curator
API will
> > >>> invoke
> > >>>>>>>      takeLeadership () method at this time will start
the
> > >>> TajoMaster.
> > >>>>>>>      3. As long as the TajoMaster is running the Controller
will
> > >>> keep
> > >>>>> the
> > >>>>>>>      lock and update the meta data on zookeeper server
with the
> > >>>>>>> HOSTNAME and RPC
> > >>>>>>>      PORT.
> > >>>>>>>      4. The other participant will keep waiting for
the latch/
> > >> lock
> > >>>> to
> > >>>>> be
> > >>>>>>>      released by zookeeper to obtain the leadership.
> > >>>>>>>      5. The advantage is we can have as many Tajo Master's
as we
> > >>>> wan't
> > >>>>> but
> > >>>>>>>      only one can be the leader and will consume the
resources
> > >> only
> > >>>>> after
> > >>>>>>>      obtaining the latch/lock.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>           TajoWorkerZkController ==>
> > >>>>>>>
> > >>>>>>>   1. This component  will start and connect to zookeeper
(will
> > >>> create
> > >>>>>>>      EPHEMERAL ZNODE) and wait for the events from
zookeeper.
> > >>>>>>>      2. The first listener will listener for successful
> > >>> registration.
> > >>>>>>>      3. The second listener on master node will listen
for any
> > >>>>> changes to
> > >>>>>>>      the master node received from zookeeper server.
> > >>>>>>>      4.  If the failover occurs the data on the master
ZNODE will
> > >>> be
> > >>>>>>>      changed and the new HOSTNAME and RPC PORT can
be obtained
> > >> and
> > >>>> the
> > >>>>>>>      TajoWorker can establish the new RPC connection
with the
> > >>>>> TajoMaster.
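
The worker-side behaviour described above (watch the master node, reconnect when its data changes) can be sketched in plain Java as follows. This is an in-memory illustration under stated assumptions, not ZooKeeper or Curator code: `MasterNode` is a hypothetical stand-in for the master znode, `Worker` is a hypothetical stand-in for TajoWorkerZkController, and the addresses are example values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class MasterWatchSketch {
    /** In-memory stand-in for the master znode: holds "host:port" and notifies watchers. */
    static final class MasterNode {
        private final List<Consumer<String>> watchers = new ArrayList<>();
        private String hostPort;

        /** Registers a watcher and delivers the current value, if one is already set. */
        synchronized void watch(Consumer<String> watcher) {
            watchers.add(watcher);
            if (hostPort != null) {
                watcher.accept(hostPort);
            }
        }

        /** Called when a (new) leader publishes its address; every watcher is notified. */
        synchronized void set(String newHostPort) {
            hostPort = newHostPort;
            for (Consumer<String> watcher : watchers) {
                watcher.accept(newHostPort);
            }
        }
    }

    /** Hypothetical worker that tracks which master it is connected to. */
    static final class Worker {
        private String connectedTo;
        private int connections;

        /** Reacts to a change of the master node data; ignores repeated identical values. */
        void onMasterChanged(String hostPort) {
            if (!hostPort.equals(connectedTo)) {
                connectedTo = hostPort; // a real worker would open a new RPC connection here
                connections++;
            }
        }

        String connectedTo() { return connectedTo; }

        int connections() { return connections; }
    }

    public static void main(String[] args) {
        MasterNode node = new MasterNode();
        Worker worker = new Worker();
        node.watch(worker::onMasterChanged);

        node.set("master1:26001"); // initial leader publishes its address
        node.set("master2:26001"); // failover: a new leader overwrites the data
        System.out.println(worker.connectedTo() + " (connections opened: " + worker.connections() + ")");
    }
}
```

A TajoClientZkController could reuse the same watcher pattern, since the client also only needs the current master address.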
> > >>>>>>>
> > >>>>>>>          To demonstrate I have created the small Readme.txt
file
> > >>>>>>> on Github on how to run the example. Please read the
log
> > >> statements
> > >>> on
> > >>>>> the
> > >>>>>>> console.
> > >>>>>>>
> > >>>>>>>          Similar to TajoWorkerZkController we can also
> > >>>>>>> implement TajoClientZkController.
> > >>>>>>>
> > >>>>>>>          Any help or advice is appreciated.
> > >>>>>>>
> > >>>>>>> Thanks!
> > >>>>>>> Warm Regards,
> > >>>>>>> Alvin.
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> My research interests are distributed systems, parallel computing
> and
> > >>>> bytecode based virtual machine.
> > >>>>
> > >>>> My profile:
> > >>>> http://www.linkedin.com/in/coderplay
> > >>>> My blog:
> > >>>> http://coderplay.javaeye.com
> > >>>>
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> My research interests are distributed systems, parallel computing and
> > >> bytecode based virtual machine.
> > >>
> > >> My profile:
> > >> http://www.linkedin.com/in/coderplay
> > >> My blog:
> > >> http://coderplay.javaeye.com
> > >>
> >
> >
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
