accumulo-dev mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ACCUMULO-294) tablet servers are losing zookeeper locks due to garbage collection even when there is lots of free memory
Date Mon, 30 Apr 2012 16:35:48 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Newton resolved ACCUMULO-294.
----------------------------------

    Resolution: Not A Problem
    
> tablet servers are losing zookeeper locks due to garbage collection even when there is lots of free memory
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-294
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-294
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.3.5
>         Environment: tablet servers on a large cluster are losing their locks
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Minor
>
> Noticed that 5 tablet servers stopped on a large cluster.  Found that each server had
> lost its lock due to a zookeeper session timeout. The zookeeper timeout is set to 40 seconds.
> In all the cases, this lost lock was preceded by the ejection of blocks from the block cache,
> and a garbage collection that recovered >4G of memory.  The tablet servers were running
> with 8G, and were generally running with 4G free.  There was very little time attributed to
> garbage collection, at least as printed in the debug log.  The in-memory map is small (256M)
> and running the native version.  Will experiment with more aggressive concurrent GC settings:
> {noformat}
> -XX:CMSInitiatingOccupancyFraction=75
> {noformat}
> to
> {noformat}
> -XX:CMSInitiatingOccupancyFraction=60
> {noformat}
> Zookeeper has already been configured with this:
> {noformat}
> globalOutstandingLimit=10000
> {noformat}
> Which helped enormously.  Each zookeeper server has between 500 and 1700 clients.
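
The description above notes that the debug log attributed very little time to garbage collection, yet sessions still exceeded the 40-second timeout. One way to check whether a process is stalling for longer than its logs suggest is to measure how far a short sleep overruns. The following is a minimal sketch of that idea, not code from Accumulo or ZooKeeper: the class name PauseDetector, both helper methods, the 100 ms sleep interval, and the 25% warning threshold are all assumptions made for illustration. Only the 40-second timeout comes from the report.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of Accumulo or ZooKeeper. */
public class PauseDetector {
    // Session timeout taken from the report (40 seconds).
    static final long SESSION_TIMEOUT_MS = 40_000;

    /** Milliseconds by which a sleep overran its requested duration (never negative). */
    static long overshootMillis(long requestedMs, long startNanos, long endNanos) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0, elapsedMs - requestedMs);
    }

    /** True when a stall used up at least the given fraction of the session timeout. */
    static boolean isDangerous(long stallMs, long sessionTimeoutMs, double fraction) {
        return stallMs >= sessionTimeoutMs * fraction;
    }

    public static void main(String[] args) throws InterruptedException {
        final long sleepMs = 100;       // assumed polling interval
        final double warnFraction = 0.25; // assumed warning threshold
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 50;

        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(sleepMs);
            long stall = overshootMillis(sleepMs, start, System.nanoTime());
            // A large overshoot means the whole process was stopped, whatever the cause
            // (a GC pause, swapping, or an overloaded host); this loop cannot tell which.
            if (isDangerous(stall, SESSION_TIMEOUT_MS, warnFraction)) {
                System.out.println("Process stalled for " + stall + " ms");
            }
        }
    }
}
```

A stall reported by a loop like this only shows that the process stopped running; it does not by itself identify garbage collection as the cause, so it would need to be compared against the GC log for the same interval.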

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
