hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
Date Wed, 11 May 2016 19:47:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280686#comment-15280686 ]

Tsz Wo Nicholas Sze commented on HDFS-9924:

> With regard to error handling, why not handle all errors as exceptions thrown from Future#get?

For some cases, such as network connection errors, if we do not throw an exception until Future#get,
the client could submit a large number of calls and then catch a lot of exceptions in Future#get.
It is fail-fast if the client catches an exception in the first async call.
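
To illustrate the difference, here is a minimal sketch, not the actual HDFS client code. BrokenClient, renameDeferred and renameFailFast are hypothetical names, and the connection error is simulated. With the deferred style every call is submitted and every Future#get fails; with the fail-fast style the caller stops at the first call.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

final class FailFastSketch {
  /** Hypothetical client whose connection is down (simulated, not real HDFS code). */
  static final class BrokenClient {
    /** Deferred style: the error only becomes visible through Future#get. */
    Future<Boolean> renameDeferred(final String src, final String dst) {
      FutureTask<Boolean> task = new FutureTask<Boolean>(new Callable<Boolean>() {
        @Override
        public Boolean call() throws IOException {
          throw new IOException("connection refused: " + src + " -> " + dst);
        }
      });
      task.run(); // FutureTask captures the exception; get() will rethrow it wrapped
      return task;
    }

    /** Fail-fast style: the async call itself throws the connection error. */
    Future<Boolean> renameFailFast(String src, String dst) throws IOException {
      throw new IOException("connection refused: " + src + " -> " + dst);
    }
  }

  /** Submits all calls, then counts how many Future#get invocations fail. */
  static int failuresSeenDeferred(int calls) {
    BrokenClient client = new BrokenClient();
    List<Future<Boolean>> futures = new ArrayList<Future<Boolean>>();
    for (int i = 0; i < calls; i++) {
      futures.add(client.renameDeferred("/src" + i, "/dst" + i));
    }
    int failures = 0;
    for (Future<Boolean> f : futures) {
      try {
        f.get();
      } catch (ExecutionException e) {
        failures++; // one exception per submitted call
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return failures;
  }

  /** Counts how many calls the caller attempts before it learns of the failure. */
  static int callsAttemptedFailFast(int calls) {
    BrokenClient client = new BrokenClient();
    int attempted = 0;
    try {
      for (int i = 0; i < calls; i++) {
        attempted++;
        client.renameFailFast("/src" + i, "/dst" + i);
      }
    } catch (IOException e) {
      // The first call already reports the problem, so the loop stops here.
    }
    return attempted;
  }

  public static void main(String[] args) {
    System.out.println("deferred failures: " + failuresSeenDeferred(1000));
    System.out.println("fail-fast attempts: " + callsAttemptedFailFast(1000));
  }
}
```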

> Does the Future#get callback get made without holding any locks? ...

Yes, it does not hold any locks.

> It seems concerning that we would have to make such a large change to the synchronous
> DistributedFileSystem code. ...

I agree.

> Blocking the client seems like it could be problematic for code which expects to be asynchronous.
> There should be an option to throw an exception in this case.

Throwing an exception is indeed better.
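
A rough sketch of what throwing instead of blocking could look like, assuming "this case" means the client has hit a cap on outstanding async calls (the comment does not spell that out). CallLimiter and TooManyAsyncCallsException are hypothetical names for illustration only, and making the exception unchecked is also just an assumption.

```java
import java.util.concurrent.Semaphore;

final class AsyncLimitSketch {
  /** Hypothetical exception type; the name and unchecked nature are assumptions. */
  static final class TooManyAsyncCallsException extends RuntimeException {
    private static final long serialVersionUID = 1L;

    TooManyAsyncCallsException(String message) {
      super(message);
    }
  }

  /** Hypothetical guard that caps the number of outstanding async calls. */
  static final class CallLimiter {
    private final Semaphore permits;

    CallLimiter(int maxOutstanding) {
      this.permits = new Semaphore(maxOutstanding);
    }

    /** Called before an async call is issued; never blocks the caller. */
    void beforeSubmit() {
      if (!permits.tryAcquire()) {
        throw new TooManyAsyncCallsException("too many outstanding async calls");
      }
    }

    /** Called when an async call has completed, successfully or not. */
    void onComplete() {
      permits.release();
    }
  }

  /** Returns true if the call beyond the limit is rejected with an exception. */
  static boolean rejectsWhenFull(int limit) {
    CallLimiter limiter = new CallLimiter(limit);
    for (int i = 0; i < limit; i++) {
      limiter.beforeSubmit();
    }
    try {
      limiter.beforeSubmit();
      return false;
    } catch (TooManyAsyncCallsException e) {
      return true;
    }
  }

  /** Returns true if a new call is accepted once an earlier one completes. */
  static boolean acceptsAfterCompletion(int limit) {
    CallLimiter limiter = new CallLimiter(limit);
    for (int i = 0; i < limit; i++) {
      limiter.beforeSubmit();
    }
    limiter.onComplete();
    try {
      limiter.beforeSubmit();
      return true;
    } catch (TooManyAsyncCallsException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("rejects when full: " + rejectsWhenFull(3));
    System.out.println("accepts after completion: " + acceptsAfterCompletion(3));
  }
}
```

Semaphore#tryAcquire returns immediately, so the caller's thread is never parked; the caller decides whether to retry, back off, or wait on some of its Futures.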

> I also think that we could maintain a queue of async calls that we have not submitted
> to the IPC layer yet, to avoid being limited by issues at the IPC layer.

This is a good idea although it may not be easy to implement.  Will check that.
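
One possible shape for such a queue, as a sketch only: calls wait in a client-side queue and are handed to the IPC layer while fewer than maxInFlight are outstanding. Dispatcher, maxInFlight and the list standing in for the IPC layer are all hypothetical, and a real implementation would also need to deal with failures, ordering and memory bounds.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class PendingCallQueueSketch {
  /** Hypothetical dispatcher sitting between the async API and the IPC layer. */
  static final class Dispatcher {
    private final int maxInFlight;
    private final Queue<String> pending = new ArrayDeque<String>();
    private final List<String> sentToIpc = new ArrayList<String>(); // stand-in for IPC
    private int inFlight = 0;

    Dispatcher(int maxInFlight) {
      this.maxInFlight = maxInFlight;
    }

    /** Accepts the call immediately; it may wait in the queue before reaching IPC. */
    synchronized void submit(String call) {
      pending.add(call);
      drain();
    }

    /** Invoked when the IPC layer finishes a call; frees a slot for a queued call. */
    synchronized void onCompleted() {
      if (inFlight > 0) {
        inFlight--;
      }
      drain();
    }

    /** Moves queued calls to the IPC layer while there is spare capacity. */
    private void drain() {
      while (inFlight < maxInFlight && !pending.isEmpty()) {
        sentToIpc.add(pending.poll());
        inFlight++;
      }
    }

    synchronized int pendingCount() {
      return pending.size();
    }

    synchronized int sentCount() {
      return sentToIpc.size();
    }
  }

  /** Returns {sent, pending} before and after one completion, for 5 calls and a limit of 2. */
  static int[] demo() {
    Dispatcher dispatcher = new Dispatcher(2);
    for (int i = 0; i < 5; i++) {
      dispatcher.submit("call-" + i);
    }
    int sentBefore = dispatcher.sentCount();
    int pendingBefore = dispatcher.pendingCount();
    dispatcher.onCompleted();
    return new int[] {
        sentBefore, pendingBefore, dispatcher.sentCount(), dispatcher.pendingCount()};
  }

  public static void main(String[] args) {
    int[] r = demo();
    System.out.println("sent=" + r[0] + " pending=" + r[1]
        + " then sent=" + r[2] + " pending=" + r[3]);
  }
}
```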

> Given that Hadoop 3.x will probably be Java 8 (based on the mailing list discussion),
> why not just make the async API use jdk8's CompletableFuture from day 1, rather than hacking
> it in later?

Because Java 7 does not support it.  We would like to have a larger audience for the new async API.

Will revise the design doc.  Thanks for the comments.

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked until the
> method returns.  It is very slow if a client makes a large number of independent calls in
> a single thread since each call has to wait until the previous call is finished.  It is inefficient
> if a client needs to create a large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is not blocked.
> The methods in the new API immediately return a Java Future object.  The return value can
> be obtained by the usual Future.get() method.
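
As a rough illustration of the calling pattern described above, and not the proposed HDFS API itself: AsyncFs and its rename method are hypothetical, and a small thread pool merely stands in for non-blocking RPC. A single caller thread issues all the calls first and then collects the results with Future.get().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class AsyncApiSketch {
  /** Hypothetical async facade; the real API may look quite different. */
  static final class AsyncFs {
    // Stand-in for the RPC machinery; an assumption made only for this sketch.
    private final ExecutorService rpcThreads = Executors.newFixedThreadPool(4);

    /** Returns immediately; the result is available later through Future#get. */
    Future<Boolean> rename(final String src, final String dst) {
      return rpcThreads.submit(new Callable<Boolean>() {
        @Override
        public Boolean call() {
          // Simulated outcome: a rename "succeeds" if the paths differ.
          return !src.equals(dst);
        }
      });
    }

    void close() {
      rpcThreads.shutdown();
    }
  }

  /** Issues n independent renames from one thread and counts the successes. */
  static int renameAll(int n) {
    AsyncFs fs = new AsyncFs();
    try {
      List<Future<Boolean>> results = new ArrayList<Future<Boolean>>();
      for (int i = 0; i < n; i++) {
        results.add(fs.rename("/tmp/a" + i, "/tmp/b" + i)); // caller is not blocked here
      }
      int succeeded = 0;
      for (Future<Boolean> result : results) {
        try {
          if (result.get()) { // blocks only until this particular call is done
            succeeded++;
          }
        } catch (ExecutionException e) {
          // A failed call surfaces here; this sketch simply skips it.
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
      return succeeded;
    } finally {
      fs.close();
    }
  }

  public static void main(String[] args) {
    System.out.println("successful renames: " + renameAll(10));
  }
}
```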
