hadoop-hdfs-issues mailing list archives

From "Yan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6121) Support of "mount" onto HDFS directories
Date Thu, 20 Mar 2014 01:08:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941256#comment-13941256 ]

Yan commented on HDFS-6121:

Thanks to Andrew and Chris for their comments.

It seems to me this proposal is orthogonal to "heterogeneous storage" in HDFS-2832, although
I agree that mixing the two in the current scheme might lead to some difficulty and confusion
in specifying the configuration and in understanding the full picture. A 2-dimensional
problem is almost always much harder to grasp than a 1-dimensional one.

Conceptually, the distinction is fairly clear: "heterogeneous storage" tries to address the
different physical characteristics of different types of storage media, while this proposal
tries to address I/O contention on each physical disk, even when the disks are of the same
type or even identical. With this distinction in mind, how to make the configuration and use
of the two, particularly in combination, easy and clear is a secondary question, I believe.

So the focal point is this: will this lead to random I/O or not? Note that the large HDFS block
size is not very relevant here, because I/O requests as seen by disk drivers are in units
of disk block sizes, which are typically much smaller. In other words, an HDFS block is not the
I/O request unit as seen by disks, and the I/O calls made by HDFS clients are not atomic
to disks. This leaves room for random I/O between different threads. In practice, it may or
may not show up prominently. But when it occurs, it could be very bad.
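
To illustrate the point above, here is a toy model, not a measurement. Each thread scans its
own contiguous run of disk blocks, but the disk sees the requests in whatever order they
interleave. The block numbers, the round-robin interleaving, and all function names are
assumptions made purely for illustration; real I/O schedulers and read-ahead would change
the numbers.

```python
from itertools import chain, zip_longest


def sequential_stream(start_block, length):
    """Disk-block numbers one thread requests while scanning contiguous data."""
    return list(range(start_block, start_block + length))


def interleave(*streams):
    """Round-robin merge: a simplified stand-in (assumption) for how requests
    from concurrent threads might arrive at a single disk."""
    merged = chain.from_iterable(zip_longest(*streams))
    return [block for block in merged if block is not None]


def count_seeks(requests):
    """Count requests that do not directly follow the previous disk block."""
    return sum(1 for prev, cur in zip(requests, requests[1:]) if cur != prev + 1)


# Two threads, each scanning 8 contiguous disk blocks in different disk regions.
thread_a = sequential_stream(0, 8)
thread_b = sequential_stream(1000, 8)

# One thread finishes before the other starts: a single jump between regions.
seeks_one_after_another = count_seeks(thread_a + thread_b)

# Requests alternate between the threads: every request lands in a new region.
seeks_interleaved = count_seeks(interleave(thread_a, thread_b))

print(seeks_one_after_another, seeks_interleaved)
```

In this model each thread is perfectly sequential on its own, yet the interleaved request
stream seen by the disk is not.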

It'd be interesting to see Impala's detailed stats, as mentioned by Andrew, along with its
execution characteristics, when 100% disk utilization was observed.

> Support of "mount" onto HDFS directories
> ----------------------------------------
>                 Key: HDFS-6121
>                 URL: https://issues.apache.org/jira/browse/HDFS-6121
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Yan
> Currently, HDFS configuration can only create HDFS on one or several existing local file
system directories. This pretty much abstracts physical disk drives away from HDFS users.
> While it may provide conveniences in data movement/manipulation/management/formatting,
it could deprive users of a way to access physical disks in a more directly controlled manner.
> For instance, a multi-threaded server may wish to access its disk blocks sequentially
per thread in order to avoid random I/O. If the cluster boxes have multiple physical disk
drives, and the server load is pretty much I/O-bound, then it will be quite reasonable to
hope for disk performance typical of sequential I/O. Disk read-ahead and/or buffering at various
layers may alleviate the problem to some degree, but they cannot totally eliminate it. This
could badly hurt the performance of workloads that need to scan data.
> Map/Reduce may experience the same problem as well.
> For instance, HBase region servers may wish to scan disk data for each region in a sequential
way, again, to avoid random I/O. HBase's own limitations in this regard aside, one major obstacle
is HDFS's inability to specify mappings of local directories to HDFS directories.
Specifically, the "dfs.data.dir" configuration setting only allows for the mapping from one
or multiple local directories to the HDFS root directory. In the case of data nodes with multiple
disk drives mounted as multiple local file system directories per node, the HDFS data will
be spread across all disk drives in a pretty random manner, potentially resulting in random I/O
from a multi-threaded server in which each thread reads multiple data blocks.
> A seemingly simple enhancement is the introduction of mappings from one or multiple local
FS directories to a single HDFS directory, plus necessary sanity checks, replication policies,
advice on best practices, etc., of course. Note that this should be a one-to-one or
many-to-one mapping from local to HDFS directories. The other way around, though probably
feasible, won't serve our purpose at all. This is similar to the mounting of different disks
onto different local FS directories, and will give the users an option to place and access
their data in a more controlled and efficient way.
> Conceptually this option will allow for local physical data partitioning in a distributed
environment for application data on HDFS.
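
The mapping described in the issue could be pictured roughly as follows. This is only a
sketch of the idea, not an existing HDFS feature or configuration format: the mount table
layout, the local paths, the HDFS directory names, and the helper functions are all
hypothetical. It shows a one-to-one or many-to-one mapping from local directories to HDFS
directories, one possible sanity check, and a longest-prefix lookup.

```python
import posixpath

# Hypothetical mount table: HDFS directory -> local directories backing it.
# Several local directories may back one HDFS directory (many-to-one), but a
# local directory may not back two HDFS directories.
MOUNTS = {
    "/": ["/data/disk1/dfs", "/data/disk2/dfs"],
    "/hbase/region_a": ["/data/disk3/dfs"],
    "/hbase/region_b": ["/data/disk4/dfs"],
}


def validate(mounts):
    """Sanity check: each local directory backs exactly one HDFS directory."""
    seen = {}
    for hdfs_dir, local_dirs in mounts.items():
        for local in local_dirs:
            if local in seen:
                raise ValueError(
                    f"{local} is mapped to both {seen[local]} and {hdfs_dir}"
                )
            seen[local] = hdfs_dir


def local_dirs_for(hdfs_path, mounts):
    """Return the local directories of the longest mounted prefix of hdfs_path."""
    path = posixpath.normpath(hdfs_path)
    while True:
        if path in mounts:
            return mounts[path]
        parent = posixpath.dirname(path)
        if parent == path:  # cannot go any higher; nothing covers this path
            raise KeyError(f"no mount covers {hdfs_path}")
        path = parent


validate(MOUNTS)
print(local_dirs_for("/hbase/region_a/file1", MOUNTS))
print(local_dirs_for("/user/someone/data", MOUNTS))
```

Under these assumptions, data under one mounted HDFS directory stays on the disks listed
for it, while everything else falls back to the directories mapped to the HDFS root.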

This message was sent by Atlassian JIRA
