accumulo-dev mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: Accumulo rolling restart
Date Fri, 17 Jan 2014 14:42:10 GMT
You can already shut down a tablet server with the admin command.  In that
case, the master will move tablets off that server, using the normal
user-specified balancer, and eventually stop the tablet server.  Or you can
just forcibly restart it and let the normal recovery process handle
the restart event.  The former will provide greater availability for any
one particular tablet, but the latter is often faster overall.
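
As a rough illustration of the first (orderly) option, the sketch below loops over a list of tablet servers and issues the `bin/accumulo admin stop server[:port]` command quoted later in this thread for each one. Everything besides that command line is an assumption made for the example: the function names, the `restart` hook (how a tserver process is started again is site-specific), and the idea of running the command from the Accumulo home directory. The thread does not say whether the admin command returns only after the server has fully stopped, so a real script may need to wait or poll between steps.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def stop_command(server: str, port: Optional[int] = None) -> List[str]:
    """Build the argv for `bin/accumulo admin stop server[:port]`."""
    target = f"{server}:{port}" if port is not None else server
    return ["bin/accumulo", "admin", "stop", target]


def _run_checked(argv: List[str]) -> None:
    # Default runner: execute the command and raise if it exits non-zero.
    subprocess.run(argv, check=True)


def rolling_stop(
    servers: Iterable[str],
    restart: Callable[[str], None],
    run: Callable[[List[str]], object] = _run_checked,
) -> List[List[str]]:
    """Stop and restart tablet servers one at a time.

    `restart` is a hypothetical, caller-supplied hook that starts the
    tserver process on the given host again. `run` executes one command
    and can be replaced with a stub for a dry run.
    """
    issued = []
    for server in servers:
        argv = stop_command(server)
        run(argv)        # orderly stop: the master moves tablets off first
        restart(server)  # bring the process back before touching the next one
        issued.append(argv)
    return issued
```

Passing `run=print` gives a dry run that only shows the commands that would be issued.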

The client library will hide any momentarily offline tablet.

The ability to micro-manage tablet assignment to tablet servers is what the
Balancer API provides.  The ability to automatically move tablets away from a
server already exists.  Discovering new servers already exists.
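
For readers comparing the options, the drain-and-restore procedure proposed in the quoted message below can be modeled with a small in-memory simulation. This is not Accumulo code and uses no Accumulo API: the dictionary of assignments, the function name, and the server and tablet names are all made up for illustration. It only shows the bookkeeping of the proposal, in which each tablet is moved off the target server in round-robin fashion and then moved back, so it is relocated exactly twice.

```python
from itertools import cycle
from typing import Dict, List


def rolling_restart_one(assignments: Dict[str, List[str]], target: str) -> int:
    """Simulate draining `target`, restarting it, and restoring its tablets.

    `assignments` maps a tserver name to the tablet ids it hosts and is
    mutated in place. Returns the total number of tablet moves.
    """
    others = [server for server in assignments if server != target]
    if not others:
        raise ValueError("need at least one other tserver to drain to")

    moved: Dict[str, str] = {}    # tablet id -> temporary host
    destinations = cycle(others)  # round-robin over the remaining servers

    # Drain: move every tablet off the target.
    for tablet in list(assignments[target]):
        dest = next(destinations)
        assignments[target].remove(tablet)
        assignments[dest].append(tablet)
        moved[tablet] = dest

    # The target hosts nothing now; a real procedure would restart it here.
    assert assignments[target] == []

    # Restore: move each tablet back to its original server.
    for tablet, dest in moved.items():
        assignments[dest].remove(tablet)
        assignments[target].append(tablet)

    return 2 * len(moved)
```

With a cluster of `{"t1": ["a", "b", "c"], "t2": ["d"], "t3": []}` and target `"t1"`, the function performs six moves and leaves the assignments as they started.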

-Eric



On Thu, Jan 16, 2014 at 11:51 PM, Vikram Srivastava <vikrams@cloudera.com> wrote:

> I don't want multiple TServers to host the same tablet. I'm looking for
> something like this:
>
> def moveTablet(Tablet tab, TServer tSrc, TServer tDest):
>    // Tells master to move tab from tSrc to tDest
>
>
> def restartTServer(TServer t1):
>
>   <Tell master to not assign any more tablets to t1>
>
>   Collection<Tablet> t1_tablets = t1.getTablets();
>   Map<Tablet, TServer> newLocations;
>
>   for tab in t1_tablets:
>      moveTablet(tab, t1, <some other TServer "tDest" selected in round-robin manner>)
>      newLocations.put(tab, tDest)
>
>   <Restart t1 process>
>
>   for <tab, tDest> in newLocations.entries():
>      moveTablet(tab, tDest, t1)
>
>   <Tell master t1 is eligible again for new tablets>
>
>
> This way a client faces interruptions at most twice (assuming nothing else
> fails) during a rolling restart of all TServers.
>
>
> On Thu, Jan 16, 2014 at 7:35 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
> > On Thu, Jan 16, 2014 at 10:00 PM, Vikram Srivastava
> > <vikrams@cloudera.com> wrote:
> > > Thanks for the replies. A couple of follow-up questions:
> > > 1. What would the client experience during safe shutdown? Both a new
> > > client trying to read a tablet on the TServer going down, and an
> > > existing client reading a table on the TServer that's going down.
> >
> > Existing client would talk to master to figure out the new assignment.
> > New client would do the same. Both would poll the master waiting for
> > the new assignment.
> >
> > > 2. The reason I wanted to know about a method for controlled
> > > re-assignment while both TServers are running is so that I can bring
> > > the tablets back to the original TServer, thereby ensuring that each
> > > tablet is unavailable only once during the entire rolling restart
> > > process. If there is no such method currently, I'd be happy to file a
> > > jira.
> >
> > Multiple hostings for a tablet would be considered a critical bug --
> > it should never happen. Balancing of tablets across tservers is
> > something that the master is often doing. You shouldn't have to do
> > anything but ensure the processes are running.
> >
> > >
> > >
> > > On Thu, Jan 16, 2014 at 6:21 PM, Eric Newton <eric.newton@gmail.com> wrote:
> > >
> > >> And here's the command to do it:
> > >>
> > >> $ bin/accumulo admin stop server[:port]
> > >>
> > >>
> > >> But recovery is pretty fast... killing tservers can be faster than
> > >> doing an orderly shutdown.
> > >>
> > >> With 1.4, if I was restarting nodes on several racks, I would kill the
> > >> loggers on a rack, flush all tables, and then restart all the tservers
> > >> and loggers on that rack.  Rinse and repeat.
> > >>
> > >> -Eric
> > >>
> > >>
> > >>
> > >> On Thu, Jan 16, 2014 at 9:09 PM, John Vines <vines@apache.org> wrote:
> > >>
> > >> > You can stop an individual tserver which will do a safe shutdown of
> > >> > it and reassignment. However, this won't work between releases due
> > >> > to potential version changes.
> > >> >
> > >> > Sent from my phone, please pardon the typos and brevity.
> > >> > On Jan 16, 2014 8:47 PM, "Vikram Srivastava" <vikrams@cloudera.com> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > Is there a way to re-assign a tablet from one Tserver to another
> > >> > > while both are running, in a manner so as to cause minimum impact
> > >> > > to client?
> > >> > >
> > >> > > My motivation for this is to use something that does that to do
> > >> > > rolling restart of TServers.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Vikram
> > >> > >
> > >> >
> > >>
> >
>
