hadoop-hdfs-issues mailing list archives

From "Xiaobing Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9924) [umbrella] Asynchronous HDFS Access
Date Wed, 15 Jun 2016 17:51:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332191#comment-15332191 ]

Xiaobing Zhou commented on HDFS-9924:

With your solution, you will block on request 1 for a long time before resubmitting the failed
requests 2-99.

This is an inherent defect of lacking callback support.
And a better solution is, sorry, but again, using multiple threads.
With a thread pool and CompletionService, you can (sometimes) get the failed request first.
You just had an extreme example trying to establish cause and effect. If that is really the case,
why not resort to Future#isDone(), or call Future#get(long timeout, TimeUnit unit) with a
negligible timeout? You don't have to be blocked in order to resend failed requests earlier.
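
To make the Future#isDone() suggestion concrete, here is a minimal sketch in plain java.util.concurrent. It is not the HDFS async API; the class name FuturePollingSketch, the method collectFailed, and the request ids are hypothetical and chosen only for illustration. One non-blocking sweep over the pending futures picks out the failed requests while request 1 is still in flight.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class FuturePollingSketch {

    /**
     * Sweeps the pending futures once without blocking. Completed futures are
     * removed from the map; the ids of those that failed are returned so the
     * caller can resubmit them immediately.
     */
    static List<Integer> collectFailed(Map<Integer, Future<String>> pending) {
        List<Integer> failed = new ArrayList<>();
        Iterator<Map.Entry<Integer, Future<String>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Future<String>> entry = it.next();
            Future<String> future = entry.getValue();
            if (!future.isDone()) {
                continue; // still in flight: skip it instead of blocking in get()
            }
            try {
                future.get(); // isDone() was true, so this does not wait
            } catch (ExecutionException | CancellationException e) {
                failed.add(entry.getKey()); // the request failed or was cancelled
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and stop sweeping
                return failed;
            }
            it.remove(); // finished either way, so it is no longer pending
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<Integer, Future<String>> pending = new LinkedHashMap<>();
        // Request 1 is slow: in this demo it never completes.
        pending.put(1, new CompletableFuture<String>());
        // Request 2 has already failed (simulated).
        CompletableFuture<String> failedRequest = new CompletableFuture<>();
        failedRequest.completeExceptionally(new IOException("simulated failure"));
        pending.put(2, failedRequest);
        // Request 3 has already succeeded.
        pending.put(3, CompletableFuture.completedFuture("ok"));

        System.out.println("failed: " + collectFailed(pending)); // failed: [2]
        System.out.println("in flight: " + pending.keySet());    // in flight: [1]
    }
}
```

Calling Future#get(1, TimeUnit.MILLISECONDS) and treating TimeoutException as "not ready yet" would behave similarly, at the cost of a short wait per unfinished request.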

In addition, the RPC layer is already async: Connection#receiveRpcResponse is run by a thread
(i.e. Connection extends Thread) that actively buffers the final result into Client#Call as soon
as it is available on the socket. As a result, the final result is most often already available in
Client#Call. You will not block in Future#get unless the server has not yet returned
the result.
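
The effect described above can be illustrated with a generic sketch. This is not Hadoop's Client or Connection code; ReceiverThreadSketch, PendingCall, and demo are hypothetical names, and the "response" is a fixed string rather than bytes read from a socket. A separate receiver thread stores the result as soon as it arrives, so a later get() by the caller returns without waiting.

```java
import java.util.concurrent.CompletableFuture;

public class ReceiverThreadSketch {

    /** Hypothetical stand-in for a pending RPC call; not Hadoop's Client#Call. */
    static final class PendingCall {
        final CompletableFuture<String> result = new CompletableFuture<>();
    }

    static String demo() {
        PendingCall call = new PendingCall();

        // The receiver thread plays the role of the connection thread: it
        // stores the response into the call object as soon as it has it.
        Thread receiver = new Thread(() -> call.result.complete("response"), "receiver");
        receiver.start();

        // The caller is free to do other work here. For a deterministic demo
        // we simply wait until the receiver thread has finished.
        try {
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for receiver", e);
        }

        // The result was buffered in the background, so it is already there:
        // isDone() is true and join() returns immediately.
        return call.result.isDone() + ":" + call.result.join();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true:response
    }
}
```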

Thank you anyway for the comments.

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked until the
> method returns.  It is very slow if a client makes a large number of independent calls in
> a single thread since each call has to wait until the previous call is finished.  It is inefficient
> if a client needs to create a large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is not blocked.
> The methods in the new API immediately return a Java Future object.  The return value can
> be obtained by the usual Future.get() method.

This message was sent by Atlassian JIRA

