accumulo-notifications mailing list archives

From "Mike Drob (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-2868) Make master configurable in when it kills tablet servers
Date Thu, 19 Jun 2014 15:42:24 GMT


Mike Drob commented on ACCUMULO-2868:

Todd outlines some more advanced logic for HDFS deciding when to mark a node as dead,
rather than just X retries * Y seconds.

> Make master configurable in when it kills tablet servers
> --------------------------------------------------------
>                 Key: ACCUMULO-2868
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 1.6.0
>            Reporter: Bill Havanki
>              Labels: admin, configuration, master
> On a cluster with a flaky network, the master may be unable to contact a tserver for
> some moderate amount of time and then direct it to terminate, even though the tserver is still
> up. (See {{gatherTableInformation()}} and {{StatusThread}}.) It does not appear possible to
> configure the master to be more forgiving in these checks. Relevant constants:
> * {{DEFAULT_WAIT_FOR_WATCHER}} - interval between server checks
> * {{MAX_BAD_STATUS_COUNT}} - the maximum number of failed attempts allowed before killing
> the tserver
> Making one or both of those configurable, or some other pertinent parameter configurable,
> would allow cluster admins to cope with mild network maladies.
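
For illustration only, here is a minimal sketch of what a configurable version of the check described above might look like. It is not Accumulo's actual master code: the class name BadStatusTracker, its methods, and the constructor parameters are hypothetical stand-ins for the two constants named in the issue, and the sketch assumes failures are counted per tserver and reset on the first successful contact.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Accumulo's real implementation.
final class BadStatusTracker {
    // Assumed stand-in for the hard-coded MAX_BAD_STATUS_COUNT.
    private final int maxBadStatusCount;
    // Assumed stand-in for the hard-coded DEFAULT_WAIT_FOR_WATCHER (milliseconds).
    private final long checkIntervalMs;
    // Consecutive failed status checks per tserver.
    private final Map<String, Integer> badCounts = new HashMap<>();

    BadStatusTracker(int maxBadStatusCount, long checkIntervalMs) {
        if (maxBadStatusCount < 1 || checkIntervalMs < 1) {
            throw new IllegalArgumentException("both settings must be positive");
        }
        this.maxBadStatusCount = maxBadStatusCount;
        this.checkIntervalMs = checkIntervalMs;
    }

    // Records one status check. Returns true once the number of consecutive
    // failures exceeds the configured maximum, i.e. when the master would
    // direct the tserver to terminate.
    boolean recordCheck(String tserver, boolean reachable) {
        if (reachable) {
            badCounts.remove(tserver); // any success resets the count
            return false;
        }
        int count = badCounts.merge(tserver, 1, Integer::sum);
        return count > maxBadStatusCount;
    }

    // Rough length of an outage that is tolerated: allowed failures * interval.
    long toleratedOutageMs() {
        return (long) maxBadStatusCount * checkIntervalMs;
    }
}
```

With both values read from site configuration instead of constants, an admin on a flaky network could raise either one to lengthen the tolerated outage. This remains the simple "X retries * Y seconds" model, not the more advanced logic mentioned in the comment.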

This message was sent by Atlassian JIRA
