hadoop-hdfs-issues mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
Date Tue, 12 May 2015 17:28:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540286#comment-14540286 ]

Haohui Mai commented on HDFS-7966:

Let me try to answer the questions.

The proposal needs to diverge from gRPC for two reasons: (1) gRPC requires the response
to fit into a single protobuf message, which a large read (> 2GB) cannot do, and (2) the
write path cannot be a single streaming RPC because the protocol also needs to implement
hflush() and hsync().
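
To make reason (1) concrete, here is a minimal sketch of splitting one logical read into
bounded frames instead of one message. The names {{ReadChunkPlanner}}, {{Chunk}}, {{plan}} and
{{fitsInSingleMessage}} are hypothetical and not part of HDFS or gRPC, and
{{Integer.MAX_VALUE}} is only a stand-in for the roughly 2GB ceiling mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an HDFS class): splits one logical read into bounded frames. */
final class ReadChunkPlanner {

    /** One frame of the plan: byte offset within the read, and the frame length. */
    static final class Chunk {
        final long offset;
        final int length;

        Chunk(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Assumption: a single message is capped by an int-sized length (about 2GB). */
    static boolean fitsInSingleMessage(long totalBytes) {
        return totalBytes <= Integer.MAX_VALUE;
    }

    /** Covers [0, totalBytes) with consecutive chunks of at most maxChunkBytes each. */
    static List<Chunk> plan(long totalBytes, int maxChunkBytes) {
        if (totalBytes < 0 || maxChunkBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and maxChunkBytes > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        long offset = 0;
        while (offset < totalBytes) {
            // The last chunk may be shorter than maxChunkBytes.
            int length = (int) Math.min((long) maxChunkBytes, totalBytes - offset);
            chunks.add(new Chunk(offset, length));
            offset += length;
        }
        return chunks;
    }
}
```

For example, {{ReadChunkPlanner.plan(10, 4)}} yields three chunks with (offset, length) of
(0, 4), (4, 4) and (8, 2), while {{fitsInSingleMessage(3L << 30)}} is false for a 3GB read.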

Note that evolving the read path is relatively straightforward as the implementation only
needs to provide another implementation of {{BlockReader}}. The write path, however, might
require implementing a new {{DFSOutputStream}}.
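
As a rough illustration of why the read path is the easier half, the sketch below puts a
frame-backed reader behind a small reader contract. {{SketchBlockReader}} is a hypothetical,
heavily simplified stand-in and not the real HDFS {{BlockReader}} interface; the iterator of
byte arrays stands in for data frames arriving on an HTTP/2 stream.

```java
import java.util.Iterator;

/** Hypothetical, simplified reader contract; NOT the real HDFS BlockReader interface. */
interface SketchBlockReader {
    /** Reads up to len bytes into buf starting at off; returns bytes read, or -1 at end. */
    int read(byte[] buf, int off, int len);
}

/** Serves reads from a sequence of frames (assumed stand-in for HTTP/2 DATA frames). */
final class FramedBlockReader implements SketchBlockReader {
    private final Iterator<byte[]> frames;
    private byte[] current = new byte[0];
    private int pos = 0;

    FramedBlockReader(Iterator<byte[]> frames) {
        this.frames = frames;
    }

    @Override
    public int read(byte[] buf, int off, int len) {
        if (off < 0 || len < 0 || len > buf.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        // Skip frames that are exhausted or empty; report end of block when none remain.
        while (pos == current.length) {
            if (!frames.hasNext()) {
                return -1;
            }
            current = frames.next();
            pos = 0;
        }
        // Copy from the current frame only, so a read may return fewer than len bytes.
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, buf, off, n);
        pos += n;
        return n;
    }
}
```

Callers that depend only on the reader contract would not need to know whether the bytes
came from the existing transport or from frames like these; that is the sense in which the
read path only needs another implementation.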

There should be no new port required -- the plan is to listen on the HTTP/HTTPS port that
the DN already exposes today.

> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf
> The current Data Transfer Protocol (DTP) implements a rich set of features that span
> multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by the implementation. As a result, it
> requires a non-trivial amount of work to implement HDFS clients and servers.
> This jira explores delegating the responsibilities of the session and presentation layers
> to the HTTP/2 protocol. In particular, HTTP/2 handles connection multiplexing, QoS,
> authentication and encryption, reducing the scope of DTP to the application layer only.
> Leveraging an existing HTTP/2 library should simplify the implementation of both HDFS
> clients and servers.
