hadoop-hdfs-issues mailing list archives

From "Billie Rinaldi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8213) DFSClient should not instantiate SpanReceiverHost
Date Wed, 22 Apr 2015 19:58:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507771#comment-14507771 ]

Billie Rinaldi commented on HDFS-8213:

The hadoop.htrace.span.receiver.classes property is not set in Accumulo configuration files,
but it is set in Hadoop configuration files.  Accumulo uses Hadoop configuration files to connect
to HDFS, so every DFSClient it creates picks up Hadoop's hadoop.htrace.span.receiver.classes.
HBase does something similar, I believe.

bq. Plus, it just kicks the problem up to a higher level. If my FooProcess wants to use both
HTrace and Accumulo, FooProcess could easily make the same argument that "Accumulo should
not instantiate SpanReceiverHost" since FooProcess is already doing that. And since FooProcess
uses the accumulo client, it would conflict with whatever accumulo was configuring, if the
same config file was used for everything.

No.  The way it works (did work, until this change was introduced in DFSClient) is that server
processes instantiate SpanReceiverHost.  If an app wants tracing, it also has to instantiate
SpanReceiverHost.  The Accumulo client does not instantiate SpanReceiverHost itself, just as
DFSClient should not.

bq. One thing we could do to make this a little less painful is to deduplicate span receivers
inside the library. So if both DFSClient and Accumulo requested an HTracedSpanReceiver, we
could simply create one instance of that. This would allow us to use the same config file
for everything.

The change in DFSClient changes how apps are supposed to use tracing.  It seems like this
would be mitigated by deduping SpanReceivers in htrace, but if we go that route I would like
the DFSClient change to be reverted until HDFS moves to a version of htrace with deduping.
 Otherwise, Accumulo and HBase will have to leave HDFS tracing disabled, or change how they're
configuring HDFS, if they wish to avoid double delivery of spans.
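
To make the double-delivery concern concrete, here is a rough sketch of the difference between
registering a receiver on every request and deduping by configured class name.  It is a toy
model only: ReceiverDedupSketch, SinkReceiver, NaiveRegistry and DedupingRegistry are
hypothetical names invented for illustration, not HTrace or Hadoop classes, and the actual
htrace change may well look different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every type here is a hypothetical stand-in, not an HTrace or Hadoop class.
class ReceiverDedupSketch {

    /** Stand-in receiver that forwards every span it is given to a shared sink. */
    static final class SinkReceiver {
        private final List<String> sink;
        SinkReceiver(List<String> sink) { this.sink = sink; }
        void receiveSpan(String span) { sink.add(span); }
    }

    /** Models the problem: every load request registers another receiver. */
    static final class NaiveRegistry {
        private final List<SinkReceiver> receivers = new ArrayList<>();
        private final List<String> sink;
        NaiveRegistry(List<String> sink) { this.sink = sink; }
        // className is ignored on purpose: no check is made for an existing receiver.
        void load(String className) { receivers.add(new SinkReceiver(sink)); }
        void deliver(String span) {
            for (SinkReceiver r : receivers) r.receiveSpan(span);
        }
    }

    /** Models the proposed mitigation: at most one receiver per configured class name. */
    static final class DedupingRegistry {
        private final Map<String, SinkReceiver> byClassName = new HashMap<>();
        private final List<String> sink;
        DedupingRegistry(List<String> sink) { this.sink = sink; }
        synchronized void load(String className) {
            // Only creates a receiver the first time a class name is requested.
            byClassName.computeIfAbsent(className, name -> new SinkReceiver(sink));
        }
        synchronized void deliver(String span) {
            for (SinkReceiver r : byClassName.values()) r.receiveSpan(span);
        }
    }

    /** Number of copies of one span that reach the sink without deduping. */
    static int naiveDeliveries() {
        List<String> sink = new ArrayList<>();
        NaiveRegistry registry = new NaiveRegistry(sink);
        registry.load("HTracedSpanReceiver"); // requested by the application
        registry.load("HTracedSpanReceiver"); // requested again by the client library
        registry.deliver("span-1");
        return sink.size();
    }

    /** Number of copies of one span that reach the sink with deduping. */
    static int dedupedDeliveries() {
        List<String> sink = new ArrayList<>();
        DedupingRegistry registry = new DedupingRegistry(sink);
        registry.load("HTracedSpanReceiver"); // requested by the application
        registry.load("HTracedSpanReceiver"); // second request reuses the first receiver
        registry.deliver("span-1");
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("naive deliveries:   " + naiveDeliveries());
        System.out.println("deduped deliveries: " + dedupedDeliveries());
    }
}
```

In this model the naive registry delivers the span twice and the deduping registry delivers it
once, which is the behavior difference at stake while HDFS is on a version of htrace without
deduping.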

> DFSClient should not instantiate SpanReceiverHost
> -------------------------------------------------
>                 Key: HDFS-8213
>                 URL: https://issues.apache.org/jira/browse/HDFS-8213
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Billie Rinaldi
>            Assignee: Brahma Reddy Battula
>            Priority: Critical
> DFSClient initializing SpanReceivers is a problem for Accumulo, which manages SpanReceivers
through its own configuration.  This results in the same receivers being registered multiple
times and spans being delivered more than once.  The documentation says SpanReceiverHost.getInstance
should be issued once per process, so there is no expectation that DFSClient should do this.
