accumulo-notifications mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-3396) HDFS reads are hanging
Date Tue, 09 Dec 2014 18:37:12 GMT
Eric Newton created ACCUMULO-3396:
-------------------------------------

             Summary: HDFS reads are hanging
                 Key: ACCUMULO-3396
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3396
             Project: Accumulo
          Issue Type: Bug
          Components: tserver
    Affects Versions: 1.6.1, 1.6.0
         Environment: rhel6 linux 2.6.32-279 (x86_64)
java 1.7.0_67-b01
hadoop CDH5.1.2, HA (2) federated (2) NN configuration
large production cluster

            Reporter: Eric Newton
            Assignee: Eric Newton
            Priority: Blocker


On large clusters we are seeing various forms of HDFS reads hanging:

Queries that never return.
Major compactions that hang.

Accumulo 1.6.1 incorporates detectors that report hanging major compactions and a monitor
display that reports scans by age.

Stack traces show readers in sun.nio.ch.EPollArrayWrapper.epollWait and in org.apache.hadoop.ipc.Client.Call(Client.java:1362).
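
To see how many threads are parked in a given frame, a saved jstack dump can be scanned with a small script. This is only a sketch: the function name threads_with_frame is hypothetical, and it assumes the usual jstack layout in which each thread block starts with a quoted thread name and blocks are separated by blank lines.

```python
def threads_with_frame(jstack_output, needle):
    """Return the names of threads whose stack contains the substring `needle`.

    Assumes standard jstack text: one block per thread, blocks separated by
    a blank line, first line of each block beginning with the quoted name.
    """
    names = []
    for block in jstack_output.split("\n\n"):
        lines = block.strip().splitlines()
        # Skip headers and anything that is not a thread block.
        if not lines or not lines[0].startswith('"'):
            continue
        # Look for the frame in the stack lines below the thread header.
        if any(needle in line for line in lines[1:]):
            names.append(lines[0].split('"')[1])
    return names
```

Calling threads_with_frame(dump, "EPollArrayWrapper.epollWait") on the text of a dump lists the threads currently sitting in that frame; comparing the list across dumps taken a few minutes apart can help tell a slow reader from one that is stuck.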

Netstat results for the tablet server show many connections with a single byte waiting on
the Recv-Q of the process, and no bytes waiting on the Send-Q.
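
The pattern described above can be picked out of captured netstat output with a short filter. This is a sketch under assumptions: the function name stuck_connections is hypothetical, and it expects the Linux "netstat -tn" column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```python
def stuck_connections(netstat_output, recv_q=1, send_q=0):
    """Return (local, foreign, state) for TCP connections whose queue sizes
    match exactly: by default one byte in Recv-Q and nothing in Send-Q.

    Assumes Linux `netstat -tn` columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner/header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            rq, sq = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        if rq == recv_q and sq == send_q:
            matches.append((fields[3], fields[4], fields[5]))
    return matches
```

Feeding it the text of "netstat -tn" from a tablet server host returns the matching connections, and grouping the result by foreign address may show whether they concentrate on particular hosts.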

strace of the JVM shows the typical JVM thread noise (futex calls).

jstack shows lots of read-requests to the NN.

Long-running MajCs do complete, albeit slowly.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
