hadoop-hdfs-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
Date Mon, 20 Jul 2015 06:34:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633089#comment-14633089 ]

Duo Zhang commented on HDFS-7966:
---------------------------------

I wrote a single-threaded test case that does all the test work inside the event loop.

https://github.com/Apache9/hadoop/blob/HDFS-7966-POC/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/dtp/TestHttp2ReadBlockInsideEventLoop.java

On the server side, I removed the thread pool in {{ReadBlockHandler}}.

The results are:
{noformat}
******* time based on tcp 17734ms
******* time based on http2 20019ms

******* time based on tcp 18878ms
******* time based on http2 21422ms

******* time based on tcp 17562ms
******* time based on http2 20568ms

******* time based on tcp 18726ms
******* time based on http2 20251ms

******* time based on tcp 18632ms
******* time based on http2 21227ms
{noformat}

The average time of the original tcp is 18306.4ms, and the average for HTTP/2 is 20697.4ms.

20697.4 / 18306.4 = 1.13, so HTTP/2 is 13% slower than tcp here. In the previous test it was 30% slower,
so I think context switching may be one of the reasons why HTTP/2 is much slower than tcp. I will
run this test on a real cluster to get more data.
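
For anyone who wants to recheck the arithmetic, here is a small sketch that recomputes the two averages and their ratio from the five runs listed above. The variable names are mine and not part of the test code; the numbers are copied from the output as posted.

```python
from statistics import mean

# Timings in milliseconds, copied from the five runs listed above.
tcp_ms = [17734, 18878, 17562, 18726, 18632]
http2_ms = [20019, 21422, 20568, 20251, 21227]

tcp_avg = mean(tcp_ms)
http2_avg = mean(http2_ms)

# Ratio of the averages: values above 1.0 mean HTTP/2 took longer.
ratio = http2_avg / tcp_avg

print(f"tcp average:   {tcp_avg:.1f} ms")    # 18306.4 ms
print(f"http2 average: {http2_avg:.1f} ms")  # 20697.4 ms
print(f"ratio:         {ratio:.2f}")         # 1.13
```

With only five runs per protocol this is a rough comparison, which is why more data from a real cluster is still needed.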

As for the one-{{EventLoop}}-per-datanode issue, I think it is a problem on a small cluster,
so we should allow creating multiple HTTP/2 connections to one datanode. I will modify
{{Http2ConnectionPool}} and run the test again.
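
To illustrate the idea of several connections per datanode, here is a minimal sketch in Python. It is not the {{Http2ConnectionPool}} code: the class name {{RoundRobinPool}}, the {{connect}} factory, and the {{per_node}} limit are all hypothetical, and the sketch is not thread-safe.

```python
from typing import Any, Callable, Dict, List


class RoundRobinPool:
    """Hypothetical pool: up to `per_node` connections per datanode, handed out in turn."""

    def __init__(self, connect: Callable[[str], Any], per_node: int) -> None:
        if per_node < 1:
            raise ValueError("per_node must be at least 1")
        self._connect = connect      # assumed factory: datanode address -> connection
        self._per_node = per_node
        self._conns: Dict[str, List[Any]] = {}
        self._next: Dict[str, int] = {}

    def acquire(self, datanode: str) -> Any:
        conns = self._conns.setdefault(datanode, [])
        if len(conns) < self._per_node:
            # Open connections lazily until the per-node limit is reached.
            conn = self._connect(datanode)
            conns.append(conn)
            return conn
        # Limit reached: reuse the existing connections in round-robin order.
        i = self._next.get(datanode, 0)
        self._next[datanode] = (i + 1) % len(conns)
        return conns[i]
```

If each connection ends up on a different event loop, requests to the same datanode would no longer be serialized on a single loop. Whether that helps in practice is what the repeated test should show.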

Thanks.

> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf, TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg
>
>
> The current Data Transfer Protocol (DTP) implements a rich set of features that span multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a result, it requires a non-trivial amount of work to implement HDFS clients and servers.
> This jira explores delegating the responsibilities of the session and presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles connection multiplexing, QoS, authentication and encryption, reducing the scope of DTP to the application layer only. Leveraging an existing HTTP/2 library should simplify the implementation of both HDFS clients and servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
