accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1408) unresponsive !METADATA table seen on a very large cluster
Date Tue, 04 Jun 2013 16:27:21 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674531#comment-13674531 ]

Eric Newton commented on ACCUMULO-1408:
---------------------------------------

It has been seen again, but it is not readily reproducible.

I'm thinking that we are getting client service starvation: requests are timing out before
getting serviced, so they are retried, which makes the situation worse.
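
The feedback loop described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Accumulo's RPC code: the function name `simulate`, the tick-based FIFO queue, and every parameter value are hypothetical. It assumes the server cannot tell that a client has already given up on a request, so stale attempts still consume service slots.

```python
from collections import deque


def simulate(ticks, arrivals, capacity, timeout, burst_tick=5, burst=0):
    """Toy FIFO server with client-side timeouts and retries (hypothetical model).

    Returns (useful, wasted, backlog): requests answered in time, service
    slots spent on attempts the client had already abandoned, and the queue
    length at the end of the run.
    """
    queue = deque()   # entries are (request_id, tick_the_attempt_was_sent)
    live = {}         # request_id -> tick of the client's latest attempt
    useful = wasted = 0
    next_id = 0

    for now in range(ticks):
        # New client requests arrive (plus an optional one-off burst).
        for _ in range(arrivals + (burst if now == burst_tick else 0)):
            live[next_id] = now
            queue.append((next_id, now))
            next_id += 1

        # Clients whose latest attempt has waited `timeout` ticks retry.
        # The old attempt stays in the server's queue.
        for rid, sent in list(live.items()):
            if now - sent >= timeout:
                live[rid] = now
                queue.append((rid, now))

        # The server handles up to `capacity` queued attempts per tick.
        for _ in range(min(capacity, len(queue))):
            rid, sent = queue.popleft()
            if live.get(rid) == sent:
                useful += 1      # client is still waiting on this attempt
                del live[rid]
            else:
                wasted += 1      # client gave up or was already answered

    return useful, wasted, len(queue)


if __name__ == "__main__":
    print("steady load:", simulate(50, arrivals=8, capacity=10, timeout=3))
    print("after burst:", simulate(50, arrivals=8, capacity=10, timeout=3, burst=100))
```

In this model the steady-load run wastes no service slots, while a single burst that pushes queueing delay past the timeout leaves the server spending slots on abandoned attempts, and the backlog does not drain even though normal load is below capacity.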
                
> unresponsive !METADATA table seen on a very large cluster
> ---------------------------------------------------------
>
>                 Key: ACCUMULO-1408
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1408
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.3
>         Environment: large cluster
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>
> On a large cluster, we saw an unresponsive !METADATA table; this resulted in hundreds of
> simultaneous scans of the root tablet.  The tservers did not look busy.  They had lots of data
> waiting in their recv network queues, and nothing in their output queues.  Most of these
> recv buffers did not change over time.  After killing these unresponsive servers, the system
> recovered.
> HDFS was about 80% full.
> Accumulo was running over CDH3u5.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
