hadoop-hdfs-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8069) Tracing implementation on DFSInputStream seriously degrades performance
Date Tue, 07 Apr 2015 00:23:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482274#comment-14482274 ]

Josh Elser commented on HDFS-8069:
----------------------------------

With regard to your other points:

Comments on the proposed solutions to point 1:
# As Billie said, we're not tracing the tracing code :). 
# A non-starter for me. We've had distributed tracing support built into Accumulo for years without issue; suddenly informing users that they need to spin up a second cluster is a no-go.
# If htraced had support for Accumulo as a backing store, I'd jump for joy, but running one big-table application at a time is more than enough for me. Security isn't really relevant here -- there's more to Accumulo than just the security aspect. This goes back to point 2: we've had this support inside Accumulo for some time, and we really want to see it go transparently down through HDFS for the added insight.

Point 2:
Again, I think Billie covered this already: this was caused by tracing a single operation. The traced operation in Accumulo read a file off disk, and performance tanked due to the excessive number of spans generated under that one parent span.
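
To make the amplification concrete, here's a minimal sketch of the shape of the problem, written against the HTrace 3.x-era API that Hadoop 2.7 ships with (the class and the replay loop are stand-ins for Accumulo's WAL replay, not actual Accumulo code):

{code:java}
import java.io.DataInputStream;
import java.io.IOException;

import org.apache.htrace.Sampler;
import org.apache.htrace.Trace;
import org.apache.htrace.TraceScope;

public class WalReplaySketch {
  // Stand-in for Accumulo's write-ahead-log replay: one traced parent
  // operation whose inner loop performs many tiny reads.
  static void replay(DataInputStream in, long numValues) throws IOException {
    TraceScope parent = Trace.startSpan("replay WAL", Sampler.ALWAYS);
    try {
      for (long i = 0; i < numValues; i++) {
        // With tracing active, each readByte() bottoms out in
        // DFSInputStream.read(), which opens and closes its own child
        // span (and delivers it to every SpanReceiver on close). One
        // traced parent fans out into hundreds of thousands of children.
        in.readByte();
      }
    } finally {
      parent.close();
    }
  }
}
{code}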

bq. I wonder if we could simply have Accumulo use a shim API that we could later change over to call HTrace under the covers, once these issues have been worked out. I'm a little concerned that we may want to change the HTrace API in the future and we might find that Accumulo has done some stuff we weren't expecting with it. What do you think?

It would certainly be much nicer to get rid of our tracer sink code and push it up into HTrace. Catching API changes early (instead of after a new HTrace version is released and Accumulo tries to use it) would be ideal, so perhaps this is something we can start considering. The other side of the coin is that we could (and will) be a good consumer who tries to hold you to some semblance of a stable API. Either way, it's a good discussion we can have over in HTrace rather than here :)
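
For the sake of discussion, a minimal sketch of what such a shim could look like (all names here are hypothetical, not an existing API): Accumulo code would depend only on the small interface, and the single adapter class would be the only thing touching HTrace, so HTrace API churn stays contained in one place.

{code:java}
import org.apache.htrace.Trace;
import org.apache.htrace.TraceScope;

// Hypothetical stable surface that Accumulo code would program against.
interface TraceShim {
  Scope start(String description);

  interface Scope extends AutoCloseable {
    @Override
    void close(); // narrowed to throw nothing, keeping try-with-resources clean
  }
}

// Hypothetical adapter: the only class to touch when the HTrace API moves.
class HTraceShim implements TraceShim {
  @Override
  public Scope start(String description) {
    final TraceScope scope = Trace.startSpan(description);
    return scope::close;
  }
}
{code}

Call sites would then look like {{try (TraceShim.Scope s = shim.start("scan"))}} without importing anything from HTrace.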

> Tracing implementation on DFSInputStream seriously degrades performance
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8069
>                 URL: https://issues.apache.org/jira/browse/HDFS-8069
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.0
>            Reporter: Josh Elser
>            Priority: Critical
>
> I've been doing some testing of Accumulo with HDFS 2.7.0 and have noticed a serious performance impact when Accumulo registers itself as a SpanReceiver.
> The context of the test in which I noticed the impact is that an Accumulo process reads a series of updates from a write-ahead log. This is just reading a series of Writable objects from a file in HDFS. With tracing enabled, I waited at least 10 minutes and the server still hadn't finished reading a ~300MB file.
> Doing a poor man's inspection via repeated thread dumps, I always see something like the following:
> {noformat}
> "replication task 2" daemon prio=10 tid=0x0000000002842800 nid=0x794d runnable [0x00007f6c7b1ec000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.concurrent.CopyOnWriteArrayList.iterator(CopyOnWriteArrayList.java:959)
>         at org.apache.htrace.Tracer.deliver(Tracer.java:80)
>         at org.apache.htrace.impl.MilliSpan.stop(MilliSpan.java:177)
>         - locked <0x000000077a770730> (a org.apache.htrace.impl.MilliSpan)
>         at org.apache.htrace.TraceScope.close(TraceScope.java:78)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:898)
>         - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
>         - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
>         at java.io.DataInputStream.readByte(DataInputStream.java:265)
>         at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
>         at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
>         at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:951)
>        ... more accumulo code omitted...
> {noformat}
> What I'm seeing here is that reading a single byte (in WritableUtils.readVLong) causes a new Span creation and close (which includes a flush to the SpanReceiver). This results in an extreme number of spans for {{DFSInputStream.byteArrayRead}} just for reading a file from HDFS -- over 700k spans for a file of only a few hundred MB.
> Perhaps there's something different we need to do for the SpanReceiver in Accumulo? I'm not entirely sure, but this was rather unexpected.
> cc/ [~cmccabe]
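
For anyone reading along, the shape of the 2.7.0 read path implicated by the stack trace above is roughly the following (paraphrased, not verbatim DFSInputStream source; {{readWithStrategy}} and {{sampler}} are stand-ins for the surrounding client internals):

{code:java}
// Paraphrased sketch: every read() call opens and closes its own span,
// and TraceScope.close() stops the MilliSpan and delivers it to every
// registered SpanReceiver -- on every single call.
public synchronized int read(byte[] buf, int off, int len) throws IOException {
  TraceScope scope = Trace.startSpan("DFSInputStream.byteArrayRead", sampler);
  try {
    return readWithStrategy(buf, off, len); // the actual byte copying
  } finally {
    scope.close();
  }
}
{code}

At one span per read(), with WritableUtils pulling bytes one at a time, span counts in the hundreds of thousands for a single file fall straight out of this shape.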



