hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
Date Tue, 14 Jun 2016 19:01:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330173#comment-15330173 ]

stack commented on HDFS-9924:

bq. There are multiple comments from both sides indicating that CompletableFuture is the ideal
option for 3.x.

[~arpiagariu] Please hold off on concluding a discussion that is still ongoing (CF is not 'ideal'
and is not a given). It doesn't help, sir.

bq. You mean just like we recently added 'avoid local nodes' because another downstream component
wanted to try it? 

You misrepresent, again. HBase ran for years with a workaround while waiting on the behavior
to show up in HDFS; i.e. the HBase project did not have an 'interest' in 'avoid local nodes';
they required this behavior of the filesystem and ran with a suboptimal hack until it showed
up.

In this case, all we have is 'interest', and requests for technical justification go unanswered.

bq. The Hive engineers think they can make it work for them and there was a compromise proposed
to introduce the API as unstable.

I'm interested in how Hive will do async with only a Future, and in how this suboptimal API in
particular will solve their issue (is it described anywhere?). In my experience, a bunch of
rigging (threads) for polling, rather than notification, is required when all you have is
a Future to work with.
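
To make the polling-versus-notification point concrete, here is a minimal sketch using only java.util.concurrent. The class FutureVsCallback and the method slowRename are hypothetical stand-ins for a slow filesystem call; they are not part of any HDFS API, and this is not the API proposed in this JIRA.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FutureVsCallback {

    // Hypothetical stand-in for a slow filesystem call; not an HDFS API.
    static String slowRename(String src, String dst) {
        try {
            TimeUnit.MILLISECONDS.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return src + " -> " + dst;
    }

    // Plain Future: the caller has to block in get() or poll isDone(),
    // so some thread ends up dedicated to waiting.
    static String withPlainFuture(ExecutorService pool) throws Exception {
        Future<String> f = pool.submit(() -> slowRename("/a", "/b"));
        while (!f.isDone()) {
            TimeUnit.MILLISECONDS.sleep(5); // polling loop
        }
        return f.get();
    }

    // CompletableFuture: the continuation is registered up front and runs
    // when the result arrives; no thread polls for completion.
    static CompletableFuture<String> withCallback(ExecutorService pool) {
        return CompletableFuture
            .supplyAsync(() -> slowRename("/a", "/b"), pool)
            .thenApply(result -> "done: " + result);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(withPlainFuture(pool));
            // get() is used here only so the demo waits before shutting down.
            System.out.println(withCallback(pool).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

With the plain Future, anything that should happen after the call completes has to be driven by a thread that waits or polls. With the callback form, the follow-up work is attached to the result itself.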

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked until the
> method returns.  It is very slow if a client makes a large number of independent calls in
> a single thread since each call has to wait until the previous call is finished.  It is
> inefficient if a client needs to create a large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is not blocked.
> The methods in the new API immediately return a Java Future object.  The return value can
> be obtained by the usual Future.get() method.
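
The calling pattern described in the issue text above might look roughly like the sketch below: one thread issues many independent calls without waiting, then collects each result with Future.get(). The names AsyncBatchSketch, renameAsync and renameAll are hypothetical, and the executor merely simulates an RPC layer; none of this is taken from the attached design document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncBatchSketch {

    // Hypothetical async call that returns immediately with a Future.
    // The executor stands in for an RPC layer; this is not an HDFS API.
    static Future<Boolean> renameAsync(ExecutorService rpc, String src, String dst) {
        return rpc.submit(() -> !src.equals(dst));
    }

    // Issue every call first, then collect the results, so no call
    // waits for the previous one to finish before being sent.
    static List<Boolean> renameAll(ExecutorService rpc, List<String> paths) throws Exception {
        List<Future<Boolean>> pending = new ArrayList<>();
        for (String p : paths) {
            pending.add(renameAsync(rpc, p, p + ".bak"));
        }
        List<Boolean> results = new ArrayList<>();
        for (Future<Boolean> f : pending) {
            results.add(f.get()); // blocks only until this result is ready
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService rpc = Executors.newFixedThreadPool(4);
        try {
            System.out.println(renameAll(rpc, Arrays.asList("/x", "/y", "/z")));
        } finally {
            rpc.shutdown();
        }
    }
}
```

The caller still blocks in get() when it finally needs a value, which is the limitation the comment above raises about having only a Future to work with.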

