hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-270) DFS Upgrade should process dfs.data.dirs in parallel
Date Thu, 07 Oct 2010 00:40:31 GMT

    [ https://issues.apache.org/jira/browse/HDFS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918747#action_12918747 ]

Matt Foley commented on HDFS-270:

Our datanodes take 5-15 minutes per volume to upgrade. With four disks per node upgraded
serially, the NN waits about 45 minutes before it starts getting registrations.  In our
environment the majority of restarts are for upgrades, so this is important operationally.

I'll post a proposal in a few days to parallelize this, and possibly speed it up.
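
For illustration only, and not the proposal itself: a minimal sketch of what processing
the data directories in parallel could look like, assuming each volume's upgrade is
independent of the others. The class name ParallelVolumeUpgrade, the method upgradeAll,
and the upgradeOne callback are hypothetical and are not part of the Hadoop code base.
With four volumes at 5-15 minutes each, the serial path costs the sum of the per-volume
times (20-60 minutes), while this shape is bounded by the slowest volume when the disks
do not contend with each other.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical sketch: run one upgrade task per configured data directory. */
class ParallelVolumeUpgrade {

    /**
     * Runs upgradeOne against every directory concurrently, one thread per
     * directory, and returns only after every task has been waited on.
     * The result maps each directory to the error it raised, or null on success.
     */
    static Map<String, Throwable> upgradeAll(List<String> dataDirs,
                                             Consumer<String> upgradeOne) {
        Map<String, Throwable> results = new LinkedHashMap<>();
        if (dataDirs.isEmpty()) {
            return results; // a fixed pool of size 0 is not allowed
        }
        ExecutorService pool = Executors.newFixedThreadPool(dataDirs.size());
        try {
            // Submit all volumes first so they run at the same time.
            List<Future<?>> futures = new ArrayList<>();
            for (String dir : dataDirs) {
                futures.add(pool.submit(() -> upgradeOne.accept(dir)));
            }
            // Then wait for each one; a failure on one volume does not
            // stop the others from finishing.
            for (int i = 0; i < futures.size(); i++) {
                String dir = dataDirs.get(i);
                try {
                    futures.get(i).get();
                    results.put(dir, null);
                } catch (ExecutionException e) {
                    results.put(dir, e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt
                    results.put(dir, e);
                }
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

A caller would pass the configured directories and whatever per-volume upgrade routine
applies, then refuse to register with the NN if any entry in the result is non-null.
One thread per directory is the simplest choice; if several directories share a physical
device, a smaller pool may be the better trade-off.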

> DFS Upgrade should process dfs.data.dirs in parallel
> ----------------------------------------------------
>                 Key: HDFS-270
>                 URL: https://issues.apache.org/jira/browse/HDFS-270
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Assignee: Matt Foley
>            Priority: Minor
> I just upgraded from 0.14.2 to 0.15.0, and things went very smoothly, if a little slowly.
> The main reason the upgrade took so long was the block upgrades on the datanodes. Each
> of our datanodes has 3 drives listed for the dfs.data.dir parameter. From looking at the logs,
> it is fairly clear that the upgrade procedure does not attempt to upgrade all listed dfs.data.dir's
> in parallel.
> I think even if all of your dfs.data.dir's are on the same physical device, there would
> still be an advantage to performing the upgrade process in parallel. The less downtime, the
> better: especially if it is potentially 20 minutes versus 60 minutes.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
