hadoop-hdfs-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
Date Wed, 08 Jul 2015 05:08:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617985#comment-14617985 ]

Duo Zhang commented on HDFS-7966:

I think this is the worst-case scenario for testing an NIO framework. NIO is supposed to need
fewer threads than OIO, but in this test NIO needs at least 4 threads while OIO needs only 2.
The flame graph shows that context switching costs a lot (ThreadPoolExecutor-related
operations, EventLoop.execute, selector.wakeup, etc.). The buffer pooling is also redundant
here: OIO needs just one buffer for the server and one buffer for the client. Finally, I think
testing through localhost makes things worse, since network speed and latency are no longer
the bottleneck.
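
As a rough illustration of the OIO side of this comparison, the sketch below moves data over
a loopback socket with exactly two threads and one buffer per side. It is not the test
attached to this issue; the class name OioLoopbackSketch, the method transfer, and the buffer
and payload sizes are all assumptions made for the example.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch, not the benchmark attached to HDFS-7966.
public class OioLoopbackSketch {

    // Sends totalBytes over a loopback socket using blocking I/O and
    // returns the number of bytes the client actually read.
    static long transfer(long totalBytes, int bufferSize) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Thread 1: the server side, with its single reusable buffer.
            Thread serverThread = new Thread(() -> {
                byte[] buf = new byte[bufferSize];
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    long remaining = totalBytes;
                    while (remaining > 0) {
                        int n = (int) Math.min(buf.length, remaining);
                        out.write(buf, 0, n);
                        remaining -= n;
                    }
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            serverThread.start();

            // Thread 2: the calling thread is the client, also with one buffer.
            long received = 0;
            byte[] buf = new byte[bufferSize];
            try (Socket c = new Socket(loopback, server.getLocalPort());
                 InputStream in = c.getInputStream()) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    received += n;
                }
            }
            serverThread.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        long total = 64L * 1024 * 1024; // arbitrary payload size for the sketch
        long start = System.nanoTime();
        long received = transfer(total, 64 * 1024);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("read %d bytes in %.3f s%n", received, seconds);
    }
}
```

Over loopback there is no network latency to hide, so a setup like this mostly measures
copying and thread hand-off costs, which is why extra threads on the NIO side show up so
clearly.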

I plan to test the following next:
1. Read a large block (256MB or more).
2. Simulate the scenario where a datanode caches a lot of connections from different machines
and only a few of them read at the same time.
3. Run all tests on a real cluster (that is, read data from another machine).


> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf, TestHttp2Performance.svg
> The current Data Transfer Protocol (DTP) implements a rich set of features that span across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a result, it requires a non-trivial amount of work to implement HDFS clients and servers.
> This jira explores delegating the responsibilities of the session and presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles connection multiplexing, QoS, authentication and encryption, reducing the scope of DTP to the application layer only. Leveraging an existing HTTP/2 library should simplify the implementation of both HDFS clients and servers.

