From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Mon, 6 Apr 2015 22:49:12 +0000 (UTC)
Subject: [jira] [Comment Edited] (HDFS-8069) Tracing implementation on DFSInputStream seriously degrades performance

    [ https://issues.apache.org/jira/browse/HDFS-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482120#comment-14482120 ]

Colin Patrick McCabe edited comment on HDFS-8069 at 4/6/15 10:48 PM:
---------------------------------------------------------------------

I can think of a few ways to solve issue #1:

1. Disable tracing in Hadoop by setting {{hadoop.htrace.sampler}} to {{NeverSampler}} (a sketch is below). Needless to say, this still gives you the tracing you currently get from Accumulo, but not tracing from Hadoop. So it's not a regression, but it won't give you additional functionality.

2. Send the trace spans to a different Accumulo instance than the one you are tracing. That second Accumulo instance can have tracing turned off (both Accumulo tracing and Hadoop tracing), which avoids the amplification effect.

3. Just use htraced. We could add security to htraced if that is a concern.

I wonder if we could simply have Accumulo use a shim API that we could later change over to call HTrace under the covers, once these issues have been worked out. I'm a little concerned that we may want to change the HTrace API in the future and might find that Accumulo has done some stuff we weren't expecting with it. What do you think?
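For concreteness, option 1 amounts to roughly the following on the client doing the reads. This is only a sketch: the helper name is invented, and it assumes the standard {{org.apache.hadoop.conf.Configuration}} and {{FileSystem}} APIs; the property name and value come from the comment above.

{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Sketch of option 1: disable HTrace sampling for this client, so DFSInputStream
// never starts spans and nothing is delivered to the registered SpanReceivers.
public static FileSystem openUntracedFileSystem() throws IOException {
  Configuration conf = new Configuration();
  conf.set("hadoop.htrace.sampler", "NeverSampler");  // property and value as above
  return FileSystem.get(conf);
}
{noformat}

The shim could be as small as something like the following (names invented purely for illustration, not an existing API):

{noformat}
// Hypothetical shim: Accumulo codes against this, and an HTrace-backed
// implementation gets swapped in once the HTrace API has settled.
public interface TraceShim {
  SpanScope startSpan(String description);

  interface SpanScope extends AutoCloseable {
    @Override
    void close();  // ends the span; intentionally declared not to throw
  }
}
{noformat}

Accumulo would then wrap its work in {{try (TraceShim.SpanScope scope = shim.startSpan(...)) { ... }}}, and what startSpan actually does underneath stays free to change.
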
> Tracing implementation on DFSInputStream seriously degrades performance
> -----------------------------------------------------------------------
>
>                 Key: HDFS-8069
>                 URL: https://issues.apache.org/jira/browse/HDFS-8069
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.0
>            Reporter: Josh Elser
>            Priority: Critical
>
> I've been doing some testing of Accumulo with HDFS 2.7.0 and have noticed a serious performance impact when Accumulo registers itself as a SpanReceiver.
> The context in which I noticed the impact is an Accumulo process reading a series of updates from a write-ahead log, which is just reading a series of Writable objects from a file in HDFS. With tracing enabled, I waited for at least 10 minutes and the server still hadn't finished reading a ~300MB file.
> Doing a poor-man's inspection via repeated thread dumps, I always see something like the following:
> {noformat}
> "replication task 2" daemon prio=10 tid=0x0000000002842800 nid=0x794d runnable [0x00007f6c7b1ec000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.concurrent.CopyOnWriteArrayList.iterator(CopyOnWriteArrayList.java:959)
>     at org.apache.htrace.Tracer.deliver(Tracer.java:80)
>     at org.apache.htrace.impl.MilliSpan.stop(MilliSpan.java:177)
>     - locked <0x000000077a770730> (a org.apache.htrace.impl.MilliSpan)
>     at org.apache.htrace.TraceScope.close(TraceScope.java:78)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:898)
>     - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
>     - locked <0x000000079fa39a48> (a org.apache.hadoop.hdfs.DFSInputStream)
>     at java.io.DataInputStream.readByte(DataInputStream.java:265)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
>     at org.apache.accumulo.core.data.Mutation.readFields(Mutation.java:951)
>     ... more accumulo code omitted...
> {noformat}
> What I'm seeing here is that reading a single byte (in WritableUtils.readVLong) causes a new Span to be created and closed (which includes a flush to the SpanReceiver). This results in an extreme number of spans for {{DFSInputStream.byteArrayRead}} just for reading a file from HDFS -- over 700k spans for reading a file of a few hundred MB.
> Perhaps there's something different we need to do for the SpanReceiver in Accumulo? I'm not entirely sure, but this was rather unexpected.
> cc/ [~cmccabe]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)