hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6121) Support of "mount" onto HDFS directories
Date Thu, 20 Mar 2014 16:48:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941936#comment-13941936 ]

Chris Nauroth commented on HDFS-6121:

bq. This leaves room for random I/O between different threads.

I seem to recall prior discussion of a "global I/O scheduler" idea.  This would be something
aware of all such threads (i.e. all of the map tasks on the same node trying to scan different
HDFS blocks on the same physical disk), and it would try to maximize overall throughput by
keeping one thread scanning forward while blocking the others.  Could that potentially be
a different solution to the problem?
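
As a rough illustration of that idea (not an existing HDFS API; the class name DiskScanScheduler, the scan method, and the disk identifiers are all hypothetical), a per-disk gate could let one thread scan at a time while the others wait their turn. This sketch only serializes scans per disk; a real scheduler would also have to deal with fairness across jobs, timeouts, and the latency concerns mentioned below.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: at most one sequential scan per physical disk at a time. */
final class DiskScanScheduler {
    // One fair binary semaphore per disk; "fair" means waiting scanners are admitted in FIFO order.
    private final ConcurrentHashMap<String, Semaphore> perDisk = new ConcurrentHashMap<>();

    /** Runs scanBody while holding the exclusive scan permit for diskId. */
    <T> T scan(String diskId, Supplier<T> scanBody) {
        Semaphore permit = perDisk.computeIfAbsent(diskId, d -> new Semaphore(1, true));
        // Blocks other scanners of the same disk; uninterruptible only to keep the sketch short.
        permit.acquireUninterruptibly();
        try {
            return scanBody.get();
        } finally {
            permit.release();
        }
    }
}
```

A map task would wrap its block read in something like scheduler.scan("disk0", () -> readBlock(...)), where readBlock stands in for whatever the task already does. Scans on different disks proceed in parallel, since each disk has its own permit.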

Unfortunately, I can't find any prior jiras or mailing list discussions related to this idea,
so I don't know what details have been discussed already.  There are of course potential negative
side effects to consider, like unpredictable latency.  I think this never got prioritized,
because the problem hasn't caused a large impact in practice.  (Doesn't mean we can't do it,
just that it hasn't been prioritized.)

> Support of "mount" onto HDFS directories
> ----------------------------------------
>                 Key: HDFS-6121
>                 URL: https://issues.apache.org/jira/browse/HDFS-6121
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Yan
> Currently, HDFS configuration can only create HDFS on one or several existing local file
system directories. This pretty much abstracts physical disk drives away from HDFS users.
> While it may provide conveniences in data movement/manipulation/management/formatting,
it could deprive users of a way to access physical disks in a more directly controlled manner.
> For instance, a multi-threaded server may wish to access its disk blocks sequentially
per thread for fear of random I/O otherwise. If the cluster boxes have multiple physical disk
drives, and the server load is pretty much I/O-bound, then it will be quite reasonable to
hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various
layers may alleviate the problem to some degree, but cannot eliminate it entirely. This
could badly hurt the performance of workloads that need to scan data.
> Map/Reduce may experience the same problem as well.
> For instance, HBase region servers may wish to scan disk data for each region in a sequential
way, again, to avoid random I/O. HBase's own limitations in this regard aside, one major obstacle
is HDFS's inability to specify mappings of local directories to HDFS directories.
Specifically, the "dfs.data.dir" configuration setting only allows for the mapping from one
or multiple local directories to the HDFS root directory. In the case of data nodes of multiple
disk drives mounted as multiple local file system directories per node, the HDFS data will
be spread on all disk drives in a pretty random manner, potentially resulting in random I/O from
a multi-threaded server reading multiple data blocks from each thread.
> A seemingly simple enhancement is the introduction of mappings from one or multiple local
FS directories to a single HDFS directory, plus necessary sanity checks, replication policies,
advice on best practices, etc., of course. Note that this should be a one-to-one or
many-to-one mapping from local to HDFS directories. The other way around, though probably
feasible, won't serve our purpose at all. This is similar to the mounting of different disks
onto different local FS directories, and will give the users an option to place and access
their data in a more controlled and efficient way. 
> Conceptually this option will allow for local physical data partitioning in a distributed
environment for application data on HDFS.
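
For illustration only, the one-to-one or many-to-one constraint in the description above could be checked along these lines. MountTable, its methods, and any paths passed to it are hypothetical; nothing here is an HDFS API or configuration key proposed in this issue.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not an HDFS API: maps each local FS directory to one HDFS directory. */
final class MountTable {
    // Key: local directory (e.g. one per physical disk). Value: the single HDFS directory it backs.
    private final Map<String, String> localToHdfs = new HashMap<>();

    /** Registers localDirs as the backing storage for hdfsDir (one-to-one or many-to-one). */
    void mount(String hdfsDir, List<String> localDirs) {
        // Validate first so that a rejected call leaves the table unchanged.
        for (String local : localDirs) {
            String existing = localToHdfs.get(local);
            if (existing != null && !existing.equals(hdfsDir)) {
                throw new IllegalArgumentException(
                    local + " already backs " + existing + "; one-to-many is not allowed");
            }
        }
        for (String local : localDirs) {
            localToHdfs.put(local, hdfsDir);
        }
    }

    /** Returns the HDFS directory backed by localDir, or null if it is not mounted. */
    String hdfsDirFor(String localDir) {
        return localToHdfs.get(localDir);
    }
}
```

Several local directories may back the same HDFS directory, but a local directory that already backs one HDFS directory is rejected if mounted under another, which is the "other way around" case the description rules out.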

This message was sent by Atlassian JIRA
