ambari-dev mailing list archives

From "Alejandro Fernandez (JIRA)" <>
Subject [jira] [Updated] (AMBARI-12252) Prevent datanode from creating an HDFS datadir when drive becomes unmounted
Date Thu, 02 Jul 2015 04:52:04 GMT


Alejandro Fernandez updated AMBARI-12252:
    Attachment: AMBARI-12252.patch

> Prevent datanode from creating an HDFS datadir when drive becomes unmounted
> ---------------------------------------------------------------------------
>                 Key: AMBARI-12252
>                 URL:
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 1.7.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>            Priority: Critical
>             Fix For: 2.1.0
>         Attachments: AMBARI-12252.branch-2.1.patch, AMBARI-12252.patch
> This is related to AMBARI-7506.
> Ambari maintains a file, /etc/hadoop/conf/dfs_data_dir_mount.hist,
> that contains a mapping of HDFS data dirs to the last known mount point.
> This is used to detect when a data dir becomes unmounted, in order to prevent HDFS from
> writing to the root partition.
> Consider the example of a data node configured with these volumes: 
> /dev/sda -> / 
> /dev/sdb -> /grid/0
> /dev/sdc -> /grid/1
> /dev/sdd -> /grid/2
> Typically, each /grid/#/ directory contains a data folder.
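
To illustrate how a data dir resolves to a mount point, here is a minimal Python sketch. It is not taken from the Ambari agent; the function name `find_mount_point`, the injectable `ismount` parameter, and the `/grid/#/data` paths are assumptions made for illustration. It walks up from a path until it reaches a mount point, so an unmounted volume resolves to `/`.

```python
import os


def find_mount_point(path, ismount=os.path.ismount):
    """Return the nearest ancestor of `path` (or `path` itself) that is a mount point.

    `ismount` is injectable (hypothetical parameter) so the walk can be
    exercised without real mounts.
    """
    current = os.path.abspath(path)
    while not ismount(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the top of the filesystem
            break
        current = parent
    return current


# Simulated mount table based on the example volumes, with /grid/1 unmounted.
mounted = {"/", "/grid/0", "/grid/2"}
is_mounted = lambda p: p in mounted

for data_dir in ("/grid/0/data", "/grid/1/data", "/grid/2/data"):
    print(data_dir, "->", find_mount_point(data_dir, ismount=is_mounted))
```

In this simulation the data dir under /grid/1 resolves to `/`, which is the signal that its drive is no longer mounted.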
> Today, if a data directory becomes unmounted, the directory will not exist and Ambari
> will not create it automatically. Ambari will simply log a warning and update its cache with
> the new mount point, which is /; that is the underlying bug.
> If hdfs-site contains dfs.datanode.failed.volumes.tolerated with a value > 0, then
> DataNode will tolerate the failure; otherwise, the DataNode will die.
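
The effect of dfs.datanode.failed.volumes.tolerated can be modeled roughly as follows. This is a simplified sketch, not Hadoop's implementation; the function name `datanode_survives` and its parameters are hypothetical, and it assumes the DataNode stays up as long as the number of failed volumes does not exceed the tolerated count.

```python
def datanode_survives(failed_volumes, failed_volumes_tolerated=0):
    """Simplified model: the DataNode keeps running while the number of
    failed volumes does not exceed the configured tolerance."""
    return failed_volumes <= failed_volumes_tolerated


# One volume (e.g. /grid/1) is lost.
print(datanode_survives(1, failed_volumes_tolerated=0))  # tolerance 0
print(datanode_survives(1, failed_volumes_tolerated=1))  # tolerance 1
```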
> Because Ambari will already have "/" in its cache file, the fact that the data dir used to
> be mounted on a non-root drive is lost, so the next time DataNode is restarted, Ambari will
> create the data dir, which now resides on the root partition. This is really bad because
> HDFS will then fill up the root drive.
> The admin can still remount the partition, but then needs to restart DataNode so Ambari
> can update its cache.
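
One way to avoid the behavior described above is sketched below in Python. This is not the content of AMBARI-12252.patch; the function `plan_data_dirs`, its arguments, and the in-memory dict standing in for the cache file are all assumptions made for illustration. The idea is to refuse to create a data dir whose last known mount point was a non-root drive but which now resolves to `/`, and to keep the old cache entry in that case so the information is not lost.

```python
ROOT = "/"


def plan_data_dirs(data_dirs, last_known, current_mount_of):
    """Decide which data dirs may be created and what the cache should record.

    data_dirs        -- iterable of configured data dir paths
    last_known       -- dict: data dir -> last known mount point (the cache)
    current_mount_of -- callable: data dir -> mount point it resolves to now
    Returns (creatable, skipped, updated_cache).
    """
    creatable, skipped = [], []
    updated = dict(last_known)  # never mutate the caller's cache
    for d in data_dirs:
        previous = last_known.get(d)
        current = current_mount_of(d)
        lost_mount = previous is not None and previous != ROOT and current == ROOT
        if lost_mount:
            # Drive appears unmounted: do not create the dir on the root
            # partition, and keep the previous mount point in the cache.
            skipped.append(d)
        else:
            creatable.append(d)
            updated[d] = current
    return creatable, skipped, updated


# Hypothetical state: /grid/1 was mounted earlier but has since dropped out.
cache = {"/grid/0/data": "/grid/0", "/grid/1/data": "/grid/1", "/grid/2/data": "/grid/2"}
now = {"/grid/0/data": "/grid/0", "/grid/1/data": "/", "/grid/2/data": "/grid/2"}

creatable, skipped, new_cache = plan_data_dirs(cache.keys(), cache, now.get)
print("create:", creatable)
print("skip:  ", skipped)
print("cache: ", new_cache)
```

Because the cache entry for the skipped dir still points at its original mount point, a later restart can recognize that the drive is missing instead of treating `/` as the expected location.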
