accumulo-notifications mailing list archives

From "Jeff Schmidt (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4615) ThreadPool timeout when checking tserver stats is confusing
Date Thu, 22 Feb 2018 19:11:00 GMT


Jeff Schmidt commented on ACCUMULO-4615:

Sorry for the delay on this. I have an initial fix here: []

I will be testing it on a deployed system shortly but any early feedback is appreciated too.

The general idea is to:

1) Use a timeout per status-gathering task (instead of a single timeout for the entire pool)
2) Store the gathered status results in a thread-safe data structure (ConcurrentSkipListMap)
3) Add a separate property for the per-tserver status timeout
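
For anyone following along without the patch open, here is a rough sketch of what points 1 and 2 could look like. This is not the code from the fix; the class name, the StatusFetcher interface, and the gather() signature are all made up for illustration. One caveat: Future.get() starts its clock when the wait begins, not when the task starts, so this only bounds the wait on each task individually.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class StatusGatherSketch {

  /** Hypothetical stand-in for the call that fetches one tserver's status. */
  interface StatusFetcher {
    String fetch(String server) throws Exception;
  }

  /**
   * Gathers status from each server, waiting at most perServerTimeoutMillis
   * on each task. Servers that time out or fail are simply left out.
   */
  static Map<String,String> gather(List<String> servers, StatusFetcher fetcher,
      long perServerTimeoutMillis, int threads) throws InterruptedException {
    // Worker threads write results concurrently, so the map must be thread-safe.
    Map<String,String> results = new ConcurrentSkipListMap<>();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      Map<String,Future<?>> futures = new LinkedHashMap<>();
      for (String server : servers) {
        Callable<Void> task = () -> {
          results.put(server, fetcher.fetch(server));
          return null;
        };
        futures.put(server, pool.submit(task));
      }
      for (Map.Entry<String,Future<?>> e : futures.entrySet()) {
        try {
          // Timeout applies to this one task, not to the pool as a whole.
          e.getValue().get(perServerTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException te) {
          // Give up on this server only; the others are unaffected.
          e.getValue().cancel(true);
        } catch (ExecutionException ee) {
          // The fetch itself failed; skip this server.
        }
      }
    } finally {
      pool.shutdownNow();
    }
    return results;
  }
}
```

With this shape, one slow tserver costs only its own entry, and the caller can tell exactly which servers are missing by comparing the result keys against the input list.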

> ThreadPool timeout when checking tserver stats is confusing
> -----------------------------------------------------------
>                 Key: ACCUMULO-4615
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.8.1
>            Reporter: Michael Wall
>            Assignee: Jeff Schmidt
>            Priority: Minor
>             Fix For: 1.9.0, 2.0.0
> If it takes longer than the configured time to gather information from all the tablet
> servers, the thread pool stops and processing continues with whatever has been collected.
> Code is,
> default timeout is 6s.  Does not appear to be an issue prior to 1.8.
> Best case, this was really confusing.  The monitor page would have 30 tservers, then
> 5 tservers.  Didn't really see any other negative effects, no migrations and no balancing
> appeared to be affected.  Worst case though, I missed something and the master is making decisions
> based on incomplete information.
> [] please add more info if needed.

This message was sent by Atlassian JIRA
