hadoop-hdfs-issues mailing list archives

From "Xiaoyu Yao (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-9924) [umbrella] Asynchronous HDFS Access
Date Wed, 01 Jun 2016 18:31:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310837#comment-15310837
] 

Xiaoyu Yao edited comment on HDFS-9924 at 6/1/16 6:31 PM:
----------------------------------------------------------

[~daryn], thanks for the valuable feedback. [~kihwal] also mentioned a similar issue [here|https://issues.apache.org/jira/browse/HADOOP-12916?focusedCommentId=15277342&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15277342],
but I wasn't able to get clarification on it. The FSN/FSD locking issue is a very good point.
I tried to find metrics or logs about it, but there were none. I will open a separate ticket
to add more metrics and WARN/DEBUG logs for operations that hold the lock for a long time on the
namenode, similar to the WARN logs and metrics we have for slow writes/network on the datanode.

As you mentioned above, the priority level is assigned by the scheduler. As part of HADOOP-12916,
we separate the scheduler from the call queue and make it pluggable, so that priority assignment can
be customized as appropriate for different workloads. For the example of a mixed write-intensive
and read workload, I agree that the DecayRpcScheduler, which uses call rate to determine priority,
may not be a good choice. We have thought of adding a different scheduler that combines
the weight of an RPC call with its rate, but assigning the weight is tricky. For example, getContentSummary
on a directory with millions of files/dirs and on a directory with a few files/dirs won't have
the same impact on the NN.
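
To make the idea of combining call weight and call rate concrete, here is a minimal sketch. It is only an illustration under stated assumptions, not the Hadoop scheduler interface: the class WeightedRateScheduler, its methods, the per-call weights, and the share thresholds are all hypothetical. Each call adds its weight to the caller's accumulated cost, and a user's priority level is derived from that user's share of the total cost.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the Hadoop RpcScheduler interface. */
class WeightedRateScheduler {
    // Accumulated weighted cost per user (weight summed over calls).
    private final Map<String, Double> cost = new HashMap<>();
    // Assumed share-of-total thresholds, in ascending order.
    private final double[] thresholds;
    private double total = 0.0;

    WeightedRateScheduler(double... thresholds) {
        this.thresholds = thresholds.clone();
    }

    /** Record one call; the weight is an assumed estimate of its cost. */
    public void record(String user, double weight) {
        cost.merge(user, weight, Double::sum);
        total += weight;
    }

    /** Level 0 is the highest priority; a larger cost share gives a larger level. */
    public int priority(String user) {
        if (total == 0.0) {
            return 0;
        }
        double share = cost.getOrDefault(user, 0.0) / total;
        int level = 0;
        for (double t : thresholds) {
            if (share >= t) {
                level++;
            }
        }
        return level;
    }

    /** Scale down accumulated costs so that older calls matter less. */
    public void decay(double factor) {
        total = 0.0;
        for (Map.Entry<String, Double> e : cost.entrySet()) {
            e.setValue(e.getValue() * factor);
            total += e.getValue();
        }
    }
}
```

The hard part noted above remains: the sketch takes the weight as an input and does not say how to estimate it for a call such as getContentSummary, whose cost depends on the size of the directory.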

Backoff based on response time lets all users stop overloading the namenode when
high-priority RPC calls experience a longer-than-normal end-to-end delay. User2/User3/User4 (low
priority based on call rate) will have a much wider response time threshold for backing off.
In this case, User1 will be backed off first, because it breaks its relatively smaller response time
threshold, and this gets the namenode out of the state in which other users cannot use it "fairly".
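
A minimal sketch of per-level response time thresholds follows. It is a simplified illustration, not the actual Hadoop implementation: the class ResponseTimeBackoff, its methods, and the threshold values in the usage note are hypothetical, and the rule shown (back off a call when the average response time of its own priority level exceeds that level's threshold) is an assumption made for the example.

```java
/** Hypothetical sketch of backoff driven by per-level response times. */
class ResponseTimeBackoff {
    // Index is the priority level; level 0 is the highest priority.
    private final long[] thresholdMillis;
    private final long[] totalMillis;
    private final long[] count;

    ResponseTimeBackoff(long... thresholdMillis) {
        this.thresholdMillis = thresholdMillis.clone();
        this.totalMillis = new long[thresholdMillis.length];
        this.count = new long[thresholdMillis.length];
    }

    /** Record the observed end-to-end response time of one call. */
    public void addResponseTime(int level, long millis) {
        totalMillis[level] += millis;
        count[level]++;
    }

    /** Average response time seen so far at the given level. */
    public double average(int level) {
        return count[level] == 0 ? 0.0 : (double) totalMillis[level] / count[level];
    }

    /** Back off when the level's average exceeds its threshold. */
    public boolean shouldBackOff(int level) {
        return average(level) > thresholdMillis[level];
    }
}
```

With assumed thresholds of 100 ms for level 0 and 1000 ms for level 1, an average of 200 ms at level 0 triggers backoff for that level, while an average of 400 ms at level 1 does not, which mirrors the narrower and wider thresholds described above.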


We are also proposing a scheduler that offers better namenode resource management
via YARN integration in HADOOP-13128. I would appreciate it if you could share your thoughts and
comments on the proposal there as well. Thanks!



> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked until the
> method returns.  It is very slow if a client makes a large number of independent calls in
> a single thread since each call has to wait until the previous call is finished.  It is inefficient
> if a client needs to create a large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is not blocked.
> The methods in the new API immediately return a Java Future object.  The return value can
> be obtained by the usual Future.get() method.
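
The calling pattern described in the issue can be sketched with plain java.util.concurrent classes. This is an illustration only and not the proposed HDFS API: the class AsyncCallSketch, its rename method, and the stand-in blockingRename are hypothetical, and a small thread pool takes the place of whatever mechanism the real client would use. A single caller thread issues many independent calls without waiting, then collects the results with Future.get().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of a non-blocking call style; not the HDFS API. */
class AsyncCallSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    /** Stand-in for a blocking call; a real client would issue an RPC here. */
    private boolean blockingRename(String src, String dst) {
        return !src.equals(dst);
    }

    /** Returns immediately; the result is available later through the Future. */
    public Future<Boolean> rename(String src, String dst) {
        return CompletableFuture.supplyAsync(() -> blockingRename(src, dst), pool);
    }

    public void close() {
        pool.shutdown();
    }

    /** Issue n independent calls from one thread, then wait for all results. */
    public static int countSuccesses(int n) {
        AsyncCallSketch fs = new AsyncCallSketch();
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                pending.add(fs.rename("/src" + i, "/dst" + i));
            }
            int ok = 0;
            for (Future<Boolean> f : pending) {
                if (f.get()) { // blocks only until this particular call is done
                    ok++;
                }
            }
            return ok;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            fs.close();
        }
    }
}
```

The caller needs neither one thread per call nor a wait between calls; it blocks only when it asks for a result.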



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

