hadoop-common-issues mailing list archives

From "Elek, Marton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15566) Remove HTrace support
Date Mon, 10 Dec 2018 08:49:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714435#comment-16714435 ]

Elek, Marton commented on HADOOP-15566:
---------------------------------------

Thanks [~cmccabe], I agree with your points about the importance of compatibility and of
keeping the HTrace support.

My proposal is:

1.) Create a lightweight Hadoop API for tracing into which multiple implementations can be
plugged.

2.) Provide a default implementation which uses the existing HTrace code.

Implementation details:

a) Add a new optional bytes field to RPCTraceInfoProto in RpcHeader.proto. A bytes field is
used because different tracing libraries may need a serialized context of a different size:
{code:java}
diff --git a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
index aa146162896..e42f64eb631 100644
--- a/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
+++ b/hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
@@ -61,9 +61,9 @@ enum RpcKindProto {
  * what span caused the new span we will create when this message is received.
  */
 message RPCTraceInfoProto {
    optional int64 traceId = 1; // parentIdHigh
    optional int64 parentId = 2; // parentIdLow

+    optional bytes tracingContext = 3; //generic tracingInformation
 }
{code}
This is a backward-compatible change: a reader that does not know the new optional field simply
skips it.

b) In the RPC Server.java, an HTrace TraceScope is initialized from the RPC header and
propagated as part of the RpcCall:
{code:java}
      RpcCall call = new RpcCall(this, header.getCallId(),
          header.getRetryCount(), rpcRequest,
          ProtoUtil.convert(header.getRpcKind()),
          header.getClientId().toByteArray(), traceScope, callerContext);
{code}
I propose to replace this traceScope with a Hadoop-specific TraceScope marker interface. The
default implementation could be a simple class which wraps the HTrace implementation.
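
As a rough sketch of point b), the marker type and its default wrapper might look like the following. The names HadoopTraceScope and DelegatingTraceScope are made up for illustration and do not exist in Hadoop. The sketch also assumes that the wrapped library scope is closeable, so it is typed as AutoCloseable here instead of depending on the HTrace classes directly.

```java
// Hypothetical names; nothing here is an existing Hadoop class.

/** Hadoop-specific scope type that RpcCall would carry instead of the HTrace one. */
interface HadoopTraceScope extends AutoCloseable {
  @Override
  void close(); // narrowed: callers do not have to handle a checked exception
}

/** Default implementation: holds the library-specific scope and closes it. */
final class DelegatingTraceScope implements HadoopTraceScope {
  // Assumption: the tracing library's scope object can be closed.
  private final AutoCloseable delegate;

  DelegatingTraceScope(AutoCloseable delegate) {
    this.delegate = delegate;
  }

  @Override
  public void close() {
    try {
      delegate.close();
    } catch (Exception e) {
      // Surface close failures unchecked so the RPC code path stays simple.
      throw new IllegalStateException("Closing the trace scope failed", e);
    }
  }
}
```

With this shape the RPC server only sees HadoopTraceScope, and the HTrace dependency stays inside the code that constructs the DelegatingTraceScope.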

c) We can create a simple tracing singleton (similar to DefaultMetricsSystem).

Example call:
{code:java}
          try (TracingSpan context =
              HadoopTracing.INSTANCE.newContext(call.tracingSpan, "RpcServerCall")) {
            if (remoteUser != null) {
              remoteUser.doAs(call);
            } else {
              call.run();
            }
          }
{code}
d) HadoopTracing could be something like this:
{code:java}
package org.apache.hadoop.tracing;

public enum HadoopTracing {
  INSTANCE;

  private TracingProvider provider;

  public TracingSpan importContext(byte[] data) {
    return provider.importContext(data);
  }

  public byte[] exportContext() {
    return provider.exportContext();
  }

  public TracingSpan newContext(String name) {
    return provider.newContext(name);
  }

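  public TracingSpan newContext(TracingSpan parentSpan, String name) {
    // Delegate to the provider like the other methods instead of returning null.
    return provider.newContext(parentSpan, name);
  }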
}
{code}
e) We can add multiple TracingProvider implementations (and provide one for HTrace for
compatibility reasons).
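
The proposal does not spell out TracingProvider or TracingSpan, so the following is only a guess at their shape, derived from the methods HadoopTracing calls. InMemoryTracingProvider is a toy stand-in, not an HTrace or OpenTracing binding: it keeps open span names on a stack and serializes the current name as UTF-8 bytes, which is enough to show the export/import round trip through the new bytes field.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// All names below are illustrative assumptions, not existing Hadoop classes.

/** A span that ends when closed, so it works with try-with-resources. */
interface TracingSpan extends AutoCloseable {
  @Override
  void close(); // narrowed so callers need no catch block
}

/** The pluggable part: one implementation per tracing library. */
interface TracingProvider {
  TracingSpan importContext(byte[] data);

  byte[] exportContext();

  TracingSpan newContext(String name);

  TracingSpan newContext(TracingSpan parentSpan, String name);
}

/** Toy provider that tracks open span names; not thread-safe. */
final class InMemoryTracingProvider implements TracingProvider {
  private final Deque<String> open = new ArrayDeque<>();

  private TracingSpan start(String name) {
    open.push(name);
    return () -> open.remove(name); // closing drops the span from the stack
  }

  @Override
  public TracingSpan importContext(byte[] data) {
    // Continue a trace that was started on the other side of the RPC.
    return start("remote:" + new String(data, StandardCharsets.UTF_8));
  }

  @Override
  public byte[] exportContext() {
    String current = open.peek();
    return current == null
        ? new byte[0]
        : current.getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public TracingSpan newContext(String name) {
    return start(name);
  }

  @Override
  public TracingSpan newContext(TracingSpan parentSpan, String name) {
    return start(name); // the toy ignores the explicit parent
  }
}
```

A client would put the result of exportContext() into the tracingContext field, and the server would pass those bytes to importContext(). A real provider would encode whatever its library needs (trace id, span id, sampling flags) rather than a span name.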

+1. Personally, I would prefer a utility which adds trace support to specific annotated
methods. It could simplify the use of tracing, but it requires Java proxies. This is an
independent question, though.
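
To illustrate the annotation idea, here is a minimal sketch built on java.lang.reflect.Proxy. The @Traced annotation, the BlockService interface and the TracingProxy helper are all hypothetical. The sketch records start/end events in a list where a real version would presumably open and close a span.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

/** Hypothetical marker for methods that should be traced. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Traced {
  String value(); // span name
}

/** Hypothetical service; only read() is annotated. */
interface BlockService {
  @Traced("readBlock")
  String read(long blockId);

  String status();
}

final class TracingProxy {
  private TracingProxy() { }

  /** Wraps target so that calls to @Traced interface methods are recorded. */
  @SuppressWarnings("unchecked")
  static <T> T wrap(Class<T> iface, T target, List<String> events) {
    InvocationHandler handler = (proxy, method, args) -> {
      // The Method passed in belongs to the interface, so its annotations are visible.
      Traced traced = method.getAnnotation(Traced.class);
      if (traced != null) {
        events.add("start " + traced.value()); // a real version would open a span here
      }
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        throw e.getCause(); // rethrow the target's own exception
      } finally {
        if (traced != null) {
          events.add("end " + traced.value()); // and close the span here
        }
      }
    };
    return (T) Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] {iface}, handler);
  }
}
```

Calling TracingProxy.wrap(BlockService.class, impl, events) returns an object on which read() is traced while status() passes straight through. The cost is that dynamic proxies only work on interfaces and add a reflective call per invocation.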

> Remove HTrace support
> ---------------------
>
>                 Key: HADOOP-15566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15566
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 3.1.0
>            Reporter: Todd Lipcon
>            Priority: Major
>              Labels: security
>         Attachments: Screen Shot 2018-06-29 at 11.59.16 AM.png, ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making further releases.
> The Hadoop project currently has various hooks with HTrace. It seems in some cases (e.g. HDFS-13702)
> these hooks have had measurable performance overhead. Given these two factors, I think we
> should consider removing the HTrace integration. If there is someone willing to do the work,
> replacing it with OpenTracing might be a better choice since there is an active community.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


