hadoop-common-issues mailing list archives

From "Jelle Smet (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-9091) Allow daemon startup when at least 1 (or configurable) disk is in an OK state.
Date Mon, 26 Nov 2012 10:22:58 GMT
Jelle Smet created HADOOP-9091:
----------------------------------

             Summary: Allow daemon startup when at least 1 (or configurable) disk is in an OK state.
                 Key: HADOOP-9091
                 URL: https://issues.apache.org/jira/browse/HADOOP-9091
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs
    Affects Versions: 0.20.2
            Reporter: Jelle Smet


The example given uses datanode disk definitions, but the same should apply to any configuration
where a list of disks is provided.

I have defined multiple local disks for a datanode:
<property>
<name>dfs.data.dir</name>
<value>/data/01/dfs/dn,/data/02/dfs/dn,/data/03/dfs/dn,/data/04/dfs/dn,/data/05/dfs/dn,/data/06/dfs/dn</value>
<final>true</final>
</property>

When one of those disks breaks and is unmounted, the mountpoint (such as /data/03 in this
example) becomes a regular directory that lacks the correct permissions and possibly the
directory structure Hadoop expects.
When this happens, the datanode fails to restart, even though enough disks are still in an
OK state to proceed.  The only workaround is to alter the configuration and omit that
specific disk.

In my opinion, it would be more practical to let Hadoop daemons start when at least 1 disk/partition
in the provided list is in a usable state.  This avoids having to roll out custom configurations
for systems that temporarily have a disk (and therefore its directory layout) missing.  The threshold
could also be configurable, so that at least X of the available partitions must be in an OK state.
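
The proposed behaviour could be sketched roughly as follows. This is only an illustration: the
class UsableDirCheck, its methods, and the minUsable threshold are hypothetical names, not part
of Hadoop, and the usability test (directory exists and is readable, writable, and executable)
is an assumption rather than the checks the datanode actually performs.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposal; not Hadoop's actual startup code. */
public class UsableDirCheck {

    /** Splits a comma-separated directory list, such as a dfs.data.dir value. */
    public static List<File> parseDirs(String commaSeparated) {
        List<File> dirs = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        return dirs;
    }

    /**
     * Assumed definition of "usable": the path exists as a directory and the
     * current user can read, write, and traverse it. An unmounted mountpoint
     * that no longer contains the expected subdirectory fails this test.
     */
    public static boolean isUsable(File dir) {
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    /** Returns only the directories that pass isUsable, in their original order. */
    public static List<File> usableDirs(List<File> dirs) {
        List<File> usable = new ArrayList<>();
        for (File dir : dirs) {
            if (isUsable(dir)) {
                usable.add(dir);
            }
        }
        return usable;
    }

    /** Startup is allowed when at least minUsable directories are usable. */
    public static boolean mayStart(List<File> dirs, int minUsable) {
        return usableDirs(dirs).size() >= minUsable;
    }
}
```

With the six-directory example above and /data/03 unmounted, mayStart(dirs, 1) would return true
as long as at least one of the remaining directories passes the check, and the daemon could
continue with the directories returned by usableDirs.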

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
