accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Reopened] (ACCUMULO-3887) Lack of insight into `accumulo admin stop $tserver`
Date Thu, 04 Jun 2015 21:56:38 GMT


Josh Elser reopened ACCUMULO-3887:

Nightly automated tests show that I screwed this one up pretty good: I didn't hex-encode the
sessionID from ZK on the client before sending it to the server.
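
To illustrate the kind of mismatch described above, here is a minimal Java sketch. It assumes the session ID is a 64-bit long (as ZooKeeper session IDs are) and that the receiving side parses the session string as hexadecimal. The class and method names are hypothetical and are not taken from the Accumulo source.

```java
// Hypothetical helper; names are illustrative, not from the Accumulo codebase.
final class SessionIds {
    private SessionIds() {}

    // Encode a 64-bit session ID as lowercase hex, the form the server is assumed to expect.
    static String toHex(long sessionId) {
        return Long.toHexString(sessionId);
    }

    // Decode a hex session string back into the 64-bit ID.
    // parseUnsignedLong accepts the 16-digit output that toHexString produces for negative values.
    static long fromHex(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }
}
```

Under these assumptions, a client that sends the decimal form by mistake produces a different ID on the receiving side: the string "16" is read as 0x16, which is 22, so the session would never match.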

> Lack of insight into `accumulo admin stop $tserver`
> ---------------------------------------------------
>                 Key: ACCUMULO-3887
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.7.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.1, 1.8.0
>          Time Spent: 20m
>  Remaining Estimate: 0h
> Spent a good bit of time trying to figure out why the master _seemed_ to have shut down a tabletserver for no reason. The best explanation I could come up with is as follows.
> * Client calls {{accumulo admin stop $host}}
> * TabletServer on $host gets restarted
> * Master seeds FATE op to stop $host using only the host:port
> * FATE op will kill the fresh tserver on $host
> The amount of time between steps 1 and 3 could be arbitrarily long, so the FATE op can end up killing a tserver that started after the client issued the stop.
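
The sequence above can be modeled with a small Java sketch. This is a simplified, hypothetical registry and not the Master's actual FATE code: `LiveServers`, `stopByAddress`, and `stopBySession` are invented names, and the session is assumed to be a long that changes whenever a tserver restarts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "which tserver session currently owns host:port".
final class LiveServers {
    // host:port -> session ID of the tserver currently holding that address
    private final Map<String, Long> sessions = new HashMap<>();

    // Called when a tserver (re)starts; a restart replaces the previous session.
    void register(String hostPort, long session) {
        sessions.put(hostPort, session);
    }

    // Address-only stop: removes whichever server holds the address right now.
    boolean stopByAddress(String hostPort) {
        return sessions.remove(hostPort) != null;
    }

    // Session-aware stop: removes the server only if the session captured
    // when the request was made still matches the current one.
    boolean stopBySession(String hostPort, long expectedSession) {
        return sessions.remove(hostPort, expectedSession);
    }
}
```

In this model, if the server registered with session 1 restarts as session 2 before the stop is processed, `stopBySession("host:9997", 1L)` leaves the fresh server alone, whereas `stopByAddress("host:9997")` removes it.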
> One big thing we can do is to perform the sessionID calculation as early as possible instead of deferring it into the Master. Thankfully, we can also handle this gracefully and remain backwards compatible, so both of the following would work:
> * {{accumulo admin stop host:port}}
> * {{accumulo admin stop host:port\[session\]}}
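
As a rough sketch of how both argument forms could be accepted, the Java snippet below parses either `host:port` or `host:port[session]`. The `StopTarget` class, its fields, and the regular expression are assumptions made for illustration; the sketch also assumes the session is written as a hex string and does not reflect how Accumulo actually parses the argument.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parsed form of the argument to `accumulo admin stop`.
final class StopTarget {
    // host:port, optionally followed by [hexSession]
    private static final Pattern FORMAT =
        Pattern.compile("([^:\\[\\]]+):(\\d{1,5})(?:\\[([0-9a-fA-F]+)\\])?");

    final String host;
    final int port;
    final Optional<String> session; // empty for the legacy host:port form

    private StopTarget(String host, int port, Optional<String> session) {
        this.host = host;
        this.port = port;
        this.session = session;
    }

    // Accepts both forms so that older invocations keep working.
    static StopTarget parse(String arg) {
        Matcher m = FORMAT.matcher(arg);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "Expected host:port or host:port[session], got: " + arg);
        }
        return new StopTarget(
            m.group(1), Integer.parseInt(m.group(2)), Optional.ofNullable(m.group(3)));
    }
}
```

With this shape, a caller can check `session.isPresent()` to decide whether a session still has to be looked up or whether the one supplied should be used as-is.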

This message was sent by Atlassian JIRA
