hadoop-hdfs-dev mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-8069) Tracing implementation on DFSInputStream seriously degrades performance
Date Mon, 06 Apr 2015 19:44:12 GMT
Josh Elser created HDFS-8069:
--------------------------------

             Summary: Tracing implementation on DFSInputStream seriously degrades performance
                 Key: HDFS-8069
                 URL: https://issues.apache.org/jira/browse/HDFS-8069
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs-client
    Affects Versions: 2.7.0
            Reporter: Josh Elser
            Priority: Critical


I've been doing some testing of Accumulo with HDFS 2.7.0 and have noticed a serious performance
impact when Accumulo registers itself as a SpanReceiver.

The context of the test in which I noticed the impact is that an Accumulo process reads a series
of updates from a write-ahead log -- just reading a series of Writable objects from
a file in HDFS. With tracing enabled, I waited at least 10 minutes and the server still
hadn't finished reading a ~300MB file.

Doing a poor man's inspection via repeated thread dumps, I consistently see something like the following:

{noformat}
"replication task 2" daemon prio=10 tid=0x0000000002842800 nid=0x794d runnable [0x00007f6c7b1ec000]
   java.lang.Thread.State: RUNNABLE
        at java.util.concurrent.CopyOnWriteArrayList.iterator(CopyOnWriteArrayList.java:959)
        at org.apache.htrace.Tracer.deliver(Tracer.java:80)
        at org.apache.htrace.impl.MilliSpan.stop(MilliSpan.java:177)
        - locked <0x000000077a770730> (a org.apache.htrace.impl.MilliSpan)
        at org.apache.htrace.TraceScope.close(TraceScope.java:78)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:898)
        - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
        - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
        at java.io.DataInputStream.readByte(DataInputStream.java:265)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
        at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:951)
       ... more accumulo code omitted...
{noformat}

What I'm seeing here is that reading a single byte (in WritableUtils.readVLong) causes
a new Span to be created and closed (which includes a flush to the SpanReceiver). This results in
an extreme number of spans for {{DFSInputStream.byteArrayRead}} just for reading a file from
HDFS -- over 700k spans for a file of only a few hundred MB.
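For reference, the readVLong frames in the stack trace decode Hadoop's variable-length integer encoding one byte at a time, so even a small Mutation costs many separate single-byte read() calls on the stream. A rough sketch of that decoding logic (mirroring the behavior of WritableUtils.readVLong; this is illustrative, not the actual Hadoop source):

{code:java}
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

public class VLongSketch {
    // Sketch of WritableUtils.readVLong-style decoding: the first byte is
    // either the value itself or a length marker, and every subsequent payload
    // byte is fetched with its own readByte() call on the underlying stream.
    static long readVLong(DataInput in) throws IOException {
        byte first = in.readByte();                // stream read #1
        if (first >= -112) {
            return first;                          // small values fit in one byte
        }
        boolean negative = first < -120;
        int payloadBytes = negative ? (-120 - first) : (-112 - first);
        long v = 0;
        for (int i = 0; i < payloadBytes; i++) {
            v = (v << 8) | (in.readByte() & 0xff); // one more stream read per byte
        }
        return negative ? ~v : v;
    }

    public static void main(String[] args) throws IOException {
        // 200 encodes as a length marker (-113) plus one payload byte (0xC8):
        // two separate single-byte reads for one small value.
        byte[] encoded = {(byte) -113, (byte) 0xC8};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
        System.out.println(readVLong(in)); // prints 200
    }
}
{code}

With DFSInputStream starting and stopping a span inside each read, every one of those single-byte reads becomes a span.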

Perhaps there's something different we need to do for the SpanReceiver in Accumulo? I'm not
entirely sure, but this was rather unexpected.
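As a rough illustration of the amplification (this is not HDFS code; the counter below just stands in for the MilliSpan create/close pair per read call), one span per read() means one span per byte for a byte-at-a-time reader, while any buffering between the Writable decoding and the traced stream would amortize it:

{code:java}
import java.io.*;

// Toy stand-in for a stream that opens and closes a trace span on every read
// call, as DFSInputStream does in the stack trace above. Counting the calls
// shows the span volume a byte-at-a-time reader generates.
class CountingStream extends FilterInputStream {
    long readCalls = 0; // each call would be one span create + close
    CountingStream(InputStream in) { super(in); }
    @Override public int read() throws IOException {
        readCalls++;
        return super.read();
    }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        readCalls++;
        return super.read(b, off, len);
    }
}

public class SpanAmplification {
    static long spansFor(int bytes, boolean buffered) throws IOException {
        CountingStream traced =
            new CountingStream(new ByteArrayInputStream(new byte[bytes]));
        InputStream reader = buffered ? new BufferedInputStream(traced, 8192) : traced;
        while (reader.read() != -1) { } // byte-at-a-time, like readVLong/readByte
        return traced.readCalls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("spans, unbuffered: " + spansFor(300_000, false)); // one per byte, plus EOF
        System.out.println("spans, 8K buffer:  " + spansFor(300_000, true));  // a few dozen
    }
}
{code}

The sketch only quantifies the amplification; presumably the real fix belongs in how often DFSInputStream starts spans (or in sampling at the tracer), rather than in Accumulo-side buffering.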

cc/ [~cmccabe]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
