hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2
Date Tue, 16 Jun 2015 20:22:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588688#comment-14588688 ]

stack commented on HDFS-7966:

bq. The code path is completely separated so it should be few risks in terms of destabilizing
the trunk.

My concern is not so much destabilization. My concern is a bunch of new code that may never
get used.

Reviewing the patches so far, it seems like you fellas are working it out as you go. Nothing
wrong with that. It just seems like something better done in a branch than in mainline.

The justification for this work is a little nebulous. It has it that "[DTP on HTTP/2] ...should
simplify the implementation of both HDFS clients and servers." Apart from the fact that DTP
is but a severe subset, the 'easy' part, of what an alternative client/server would have to
implement, what if HTTP/2 complicates rather than simplifies new clients and servers? Better
to figure this out, along with 'fit criteria' that prove it simplifies, on a branch I'd say.

Also, why would folks move to using this new transport? Will it be more performant than current
DTP? (I'd guess not, given HTTP/2 does a bunch of 'extras' and going by the PoC done over
in HBase.) When complete, we might have a bunch of new code that is slower than what is currently
there and that folks are wary of trying given it is 'new'. This state of affairs could go on
such that the code is never exercised.

I am suggesting a branch because there you can work out implementation, perf characteristics,
and answers to questions such as the samples posed above, and come merge time you will
have a more solid story to tell.

> New Data Transfer Protocol via HTTP/2
> -------------------------------------
>                 Key: HDFS-7966
>                 URL: https://issues.apache.org/jira/browse/HDFS-7966
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Haohui Mai
>            Assignee: Qianqian Shi
>              Labels: gsoc, gsoc2015, mentor
>         Attachments: GSoC2015_Proposal.pdf
> The current Data Transfer Protocol (DTP) implements a rich set of features that span
> across multiple layers, including:
> * Connection pooling and authentication (session layer)
> * Encryption (presentation layer)
> * Data writing pipeline (application layer)
> All these features are HDFS-specific and defined by implementation. As a result, it requires
> a non-trivial amount of work to implement HDFS clients and servers.
> This jira explores delegating the responsibilities of the session and presentation layers
> to the HTTP/2 protocol. In particular, HTTP/2 handles connection multiplexing, QoS, authentication
> and encryption, reducing the scope of DTP to the application layer only. Leveraging an
> existing HTTP/2 library should simplify the implementation of both HDFS clients and servers.

