hadoop-hdfs-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11622) TraceId hardcoded to 0 in DataStreamer, correlation between multiple spans is lost
Date Fri, 07 Apr 2017 18:36:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961241#comment-15961241 ]

Andrew Purtell edited comment on HDFS-11622 at 4/7/17 6:36 PM:
---------------------------------------------------------------

bq. I appreciate it if there are specific problems in downstream projects like Phoenix, otherwise
we should be conservative to fix core DFS code in maintenance branch.

[~iwasakims] Spans in HDFS can have multiple parents because of batching in the HBase
WAL, which Phoenix uses as a platform. It is an interesting case. I think that if this improvement
is not committed to the shipping versions of Hadoop, then Phoenix tracing, which is based on
HTrace, won't work correctly and cannot be used reliably. HTrace is promising, but it needs to
work correctly across the whole stack or it is rendered moot.


> TraceId hardcoded to 0 in DataStreamer, correlation between multiple spans is lost
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-11622
>                 URL: https://issues.apache.org/jira/browse/HDFS-11622
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tracing
>            Reporter: Karan Mehta
>
> In the {{run()}} method of the {{DataStreamer}} class, the following code appears. {{parents\[0\]}}
> refers to the {{spanId}} of the parent span.
> {code}
>               one = dataQueue.getFirst(); // regular data packet
>               long parents[] = one.getTraceParents();
>               if (parents.length > 0) {
>                 scope = Trace.startSpan("dataStreamer", new TraceInfo(0, parents[0]));
>                 // TODO: use setParents API once it's available from HTrace 3.2
>                 // scope = Trace.startSpan("dataStreamer", Sampler.ALWAYS);
>                 // scope.getSpan().setParents(parents);
>               }
> {code}
> The {{scope}} starts a new trace span with its {{traceId}} hardcoded to 0. Ideally the
> {{traceId}} should be captured when {{currentPacket.addTraceParent(Trace.currentSpan())}}
> is invoked. This JIRA proposes adding a long field to the {{DFSPacket}} class to hold the
> parent {{traceId}}.
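
The proposal in the issue description could look roughly like the sketch below. It is a minimal, self-contained illustration and not the actual patch: {{Span}}, {{TraceInfo}}, and {{Packet}} here are hypothetical stand-ins written for this example, not the HTrace or HDFS classes. Keeping the trace ID of the first parent is also an assumption, since the issue does not say how parents from different traces should be handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the HTrace or HDFS classes.
class TraceIdSketch {

    /** Minimal stand-in for a span: the trace it belongs to plus its own id. */
    static final class Span {
        final long traceId;
        final long spanId;

        Span(long traceId, long spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    /** Minimal stand-in for the (traceId, parentSpanId) pair used to start a child span. */
    static final class TraceInfo {
        final long traceId;
        final long spanId;

        TraceInfo(long traceId, long spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    /** Stand-in for DFSPacket: records parent span ids and, per the proposal, a traceId. */
    static final class Packet {
        private final List<Long> traceParents = new ArrayList<>();
        private long traceId = 0L;        // proposed extra field
        private boolean hasTraceId = false;

        void addTraceParent(Span span) {
            if (span == null) {
                return;
            }
            traceParents.add(span.spanId);
            // Assumption: keep the trace id of the first parent that was recorded.
            if (!hasTraceId) {
                traceId = span.traceId;
                hasTraceId = true;
            }
        }

        long[] getTraceParents() {
            long[] result = new long[traceParents.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = traceParents.get(i);
            }
            return result;
        }

        long getTraceId() {
            return traceId;
        }
    }

    /** Builds the info for the child span from the packet instead of hardcoding 0. */
    static TraceInfo traceInfoFor(Packet packet) {
        long[] parents = packet.getTraceParents();
        if (parents.length == 0) {
            return null; // no parent recorded, so no span is started
        }
        return new TraceInfo(packet.getTraceId(), parents[0]);
    }

    public static void main(String[] args) {
        Packet packet = new Packet();
        packet.addTraceParent(new Span(42L, 7L));
        TraceInfo info = traceInfoFor(packet);
        System.out.println("traceId=" + info.traceId + " parentSpanId=" + info.spanId);
    }
}
```

With the trace ID carried on the packet, the child span started in the streamer stays in the same trace as the span that queued the packet, which is the correlation the issue title says is currently lost.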




