accumulo-dev mailing list archives

From Vikram Srivastava <vikr...@cloudera.com>
Subject Re: Accumulo rolling restart
Date Fri, 17 Jan 2014 04:51:33 GMT
I don't want multiple TServers to host the same tablet. I'm looking for
something like this:

def moveTablet(Tablet tab, TServer tSrc, TServer tDest):
   // Tells master to move tab from tSrc to tDest


def restartTServer(TServer t1):

  <Tell master to not assign any more tablets to t1>

  Collection<Tablet> t1_tablets = t1.getTablets();
  Map<Tablet, TServer> newLocations;

  for tab in t1_tablets:
     tDest = <some other TServer selected in round-robin manner>
     moveTablet(tab, t1, tDest)
     newLocations.put(tab, tDest)

  <Restart t1 process>

  for <tab, tDest> in newLocations.entries():
     moveTablet(tab, tDest, t1)

  <Tell master t1 is eligible again for new tablets>


This way, a client faces at most two interruptions per tablet (assuming
nothing else fails) during a rolling restart of all TServers.
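
A minimal sketch of the plan above as an in-memory simulation, to check the
"at most twice" claim. Nothing here is Accumulo API: the dict of
tserver -> tablets, the function name rolling_restart, and the move counting
are all hypothetical stand-ins for the master-side operations being asked for.

```python
from collections import defaultdict
from itertools import cycle


def rolling_restart(assignments):
    """Simulate the plan on an in-memory {tserver: set(tablets)} map.

    Hypothetical model only. Returns how many times each tablet was moved;
    each move stands for one brief interruption for clients of that tablet.
    """
    moves = defaultdict(int)
    servers = sorted(assignments)

    for t1 in servers:
        others = [s for s in servers if s != t1]
        if not others:
            raise ValueError("need at least two tservers")
        destinations = cycle(others)  # round-robin over the other servers

        # Drain t1, remembering where each tablet went.
        new_locations = {}
        for tab in sorted(assignments[t1]):
            dest = next(destinations)
            assignments[t1].remove(tab)
            assignments[dest].add(tab)
            moves[tab] += 1
            new_locations[tab] = dest

        # <Restart t1 process> would happen here; t1 hosts nothing now.
        assert not assignments[t1]

        # Bring every tablet back to t1.
        for tab, dest in new_locations.items():
            assignments[dest].remove(tab)
            assignments[t1].add(tab)
            moves[tab] += 1

    return dict(moves)


cluster = {"ts1": {"a", "b", "c"}, "ts2": {"d"}, "ts3": {"e", "f"}}
counts = rolling_restart(cluster)
```

In this model every tablet is moved exactly twice (once off its TServer, once
back), and the final assignment matches the starting one.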


On Thu, Jan 16, 2014 at 7:35 PM, Josh Elser <josh.elser@gmail.com> wrote:

> On Thu, Jan 16, 2014 at 10:00 PM, Vikram Srivastava
> <vikrams@cloudera.com> wrote:
> > Thanks for the replies. Couple of follow up questions -
> > 1. What would the client experience during safe shutdown? Both a new
> > client trying to read a tablet on the TServer going down, and an existing
> > client reading a table on the TServer that's going down.
>
> Existing client would talk to master to figure out the new assignment.
> New client would do the same. Both would poll the master waiting for
> the new assignment.
>
> > 2. The reason I wanted to know about a method for controlled
> > re-assignment while both TServers are running is so that I can bring the
> > tablets back to the original TServer, thereby ensuring that each tablet is
> > unavailable only once during the entire rolling restart process. If there
> > is no such method currently, I'd be happy to file a jira.
>
> Multiple hostings for a tablet would be considered a critical bug --
> it should never happen. Balancing of tablets across tservers is
> something that the master is often doing. You shouldn't have to do
> anything but ensure the processes are running.
>
> >
> >
> > On Thu, Jan 16, 2014 at 6:21 PM, Eric Newton <eric.newton@gmail.com> wrote:
> >
> >> And here's the command to do it:
> >>
> >> $ bin/accumulo admin stop server[:port]
> >>
> >>
> >> But recovery is pretty fast... killing tservers can be faster than doing
> >> an orderly shutdown.
> >>
> >> With 1.4, if I was restarting nodes on several racks, I would kill the
> >> loggers on a rack, flush all tables, and then restart all the tservers and
> >> loggers on that rack.  Rinse and repeat.
> >>
> >> -Eric
> >>
> >>
> >>
> >> On Thu, Jan 16, 2014 at 9:09 PM, John Vines <vines@apache.org> wrote:
> >>
> >> > You can stop an individual tserver which will do a safe shutdown of it and
> >> > reassignment. However, this won't work between releases due to potential
> >> > version changes.
> >> >
> >> > Sent from my phone, please pardon the typos and brevity.
> >> > On Jan 16, 2014 8:47 PM, "Vikram Srivastava" <vikrams@cloudera.com> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Is there a way to re-assign a tablet from one Tserver to another while
> >> > > both are running, in a manner so as to cause minimum impact to client?
> >> > >
> >> > > My motivation for this is to use something that does that to do rolling
> >> > > restart of TServers.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Vikram
> >> > >
> >> >
> >>
>
