hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
Date Wed, 15 Jun 2016 21:26:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332613#comment-15332613 ]

Andrew Wang commented on HDFS-9924:

bq. Which part of the bylaws are you talking about?

Code Change
A change made to a codebase of the project and committed by a committer. This includes source
code, documentation, website content, etc.
Consensus approval of active committers, but with a minimum of one +1. The code can be committed
after the first +1, unless the code change represents a merge from a branch, in which case
three +1s are required.

We've been discussing pros and cons and I'm glad there's now a better understanding of the
Hive use case, but this discussion can all be done with the code sitting on a feature branch.
There is a callback API proposal for branch-2+trunk (Deferred) which would make everyone happy.
It's easy to integrate the feature branch down the road.

So, let's just put it on a branch, and merge it for a later 2.x/3.x release. The opposition
is not to the general idea of an async API in branch-2; it is to having different APIs in branch-2
and trunk, and particularly an API that doesn't support callbacks. There's no urgency about
getting this into 2.8. It's not a regression. And, if this were a super critical performance
issue for Hive, they would have tried a threadpool already. A threadpool would also work with
any version of HDFS, not just 2.8.
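
For reference, the threadpool approach mentioned above can be sketched roughly as follows. This is
only an illustration, not Hive or HDFS code: blockingCall is a hypothetical stand-in for any blocking
client call, the sleep simulates RPC latency, and the pool size is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPoolSketch {
    // Hypothetical stand-in for a blocking filesystem call (assumption, not an HDFS API).
    static String blockingCall(String path) throws InterruptedException {
        Thread.sleep(10); // simulate round-trip latency
        return "done:" + path;
    }

    // Runs one blocking call per path on a fixed-size pool and collects results in input order.
    static List<String> runAll(List<String> paths, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String p : paths) {
                tasks.add(() -> blockingCall(p));
            }
            List<String> results = new ArrayList<>();
            // invokeAll waits for all tasks and returns futures in the same order as the tasks.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(Arrays.asList("/a", "/b", "/c"), 8));
    }
}
```

Because this only wraps existing blocking calls, it does not depend on any new client API, which is
why it would work with any HDFS version.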

Per the bylaws, code integration is done based on consensus. I appreciate the added context
from this continuing conversation, but even with the fuller understanding of the Hive use case,
multiple committers still want this effort continued on a branch rather than in the release
branches. So, since there is still no consensus, the code should be backed out and continued
on a feature branch.

Please respect the bylaws and the expressed desires of the other committers in the project.
If no one beats me to it, my plan is to move the commits to the HDFS-9924 branch EOD tomorrow.

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked until the
> method returns.  It is very slow if a client makes a large number of independent calls in
> a single thread since each call has to wait until the previous call is finished.  It is inefficient
> if a client needs to create a large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is not blocked.
> The methods in the new API immediately return a Java Future object.  The return value can
> be obtained by the usual Future.get() method.
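
To illustrate the difference between a Future-returning API and one that supports callbacks, here is
a rough sketch using only JDK classes. renameAsync is a hypothetical method, not the API proposed in
this JIRA, and CompletableFuture is used merely as a stand-in for a callback-capable type such as
the Deferred proposal.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class FutureVsCallbackSketch {
    // Hypothetical async call (assumption): "succeeds" when source and destination differ.
    static CompletableFuture<Boolean> renameAsync(String src, String dst) {
        return CompletableFuture.supplyAsync(() -> !src.equals(dst));
    }

    // Future style: the call returns immediately, but the caller must
    // eventually block in get() to obtain the result.
    static boolean waitWithGet(String src, String dst)
            throws InterruptedException, ExecutionException {
        Future<Boolean> f = renameAsync(src, dst);
        return f.get();
    }

    // Callback style: the follow-up work is registered up front and runs
    // when the result is ready, so no caller thread has to wait for it.
    static CompletableFuture<String> withCallback(String src, String dst) {
        return renameAsync(src, dst).thenApply(ok -> ok ? "renamed" : "failed");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(waitWithGet("/a", "/b"));
        System.out.println(withCallback("/a", "/a").get());
    }
}
```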

