From: Apache Wiki
To: core-commits@hadoop.apache.org
Date: Tue, 05 Feb 2008 19:19:44 -0000
Message-ID: <20080205191944.16591.53196@eos.apache.org>
Subject: [Hadoop Wiki] Trivial Update of "Hadoop Upgrade" by Marc Harris

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by Marc Harris:
http://wiki.apache.org/hadoop/Hadoop_Upgrade

The comment on the change is:
Change – to - in command line options

------------------------------------------------------------------------------

== Instructions: ==
 1. Stop map-reduce cluster(s) [[BR]] {{{bin/stop-mapred.sh}}} [[BR]] and all client applications running on the DFS cluster.
- 2. Run the {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files –blocks –locations > dfs-v-old-fsck-1.log}}} [[BR]] Fix DFS to the point where there are no errors. The resulting file will contain the complete block map of the file system. [[BR]] Note: redirecting the {{{fsck}}} output is recommended for large clusters in order to avoid time-consuming output to stdout.
+ 2. Run the {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files -blocks -locations > dfs-v-old-fsck-1.log}}} [[BR]] Fix DFS to the point where there are no errors. The resulting file will contain the complete block map of the file system. [[BR]] Note: redirecting the {{{fsck}}} output is recommended for large clusters in order to avoid time-consuming output to stdout.
 3. Run the {{{lsr}}} command: [[BR]] {{{bin/hadoop dfs -lsr / > dfs-v-old-lsr-1.log}}} [[BR]] The resulting file will contain the complete namespace of the file system.
 4. Run the {{{report}}} command to create a list of data nodes participating in the cluster: [[BR]] {{{bin/hadoop dfsadmin -report > dfs-v-old-report-1.log}}}
 5. Optionally, copy all data, or only the unrecoverable data, stored in DFS to a local file system or to a backup instance of DFS.
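(The three checkpoint commands above, steps 2-4, can be combined into a single shell script. The sketch below is illustrative only; it assumes it is run from the root of the Hadoop installation and reuses the log file names from these instructions.)

{{{
#!/bin/sh
# Pre-upgrade checkpoint (steps 2-4): capture the block map, the
# namespace listing, and the data-node report of the old cluster.
bin/hadoop fsck / -files -blocks -locations > dfs-v-old-fsck-1.log
bin/hadoop dfs -lsr / > dfs-v-old-lsr-1.log
bin/hadoop dfsadmin -report > dfs-v-old-report-1.log
}}}

Taking all three logs from the same quiescent cluster state is what makes the later comparisons in steps 16-18 meaningful.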
@@ -45, +45 @@
 15. Start DFS cluster. [[BR]] {{{bin/start-dfs.sh}}}
 16. Run the {{{report}}} command: [[BR]] {{{bin/hadoop dfsadmin -report > dfs-v-new-report-1.log}}} [[BR]] and compare with {{{dfs-v-old-report-1.log}}} to ensure that all data nodes previously belonging to the cluster are up and running.
 17. Run the {{{lsr}}} command: [[BR]] {{{bin/hadoop dfs -lsr / > dfs-v-new-lsr-1.log}}} [[BR]] and compare with {{{dfs-v-old-lsr-1.log}}}. These files should be identical unless the format of {{{lsr}}} reporting or the data structures have changed in the new version.
- 18. Run the {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files –blocks –locations > dfs-v-new-fsck-1.log}}} [[BR]] and compare with {{{dfs-v-old-fsck-1.log}}}. These files should be identical, unless the {{{fsck}}} reporting format has changed in the new version.
+ 18. Run the {{{fsck}}} command: [[BR]] {{{bin/hadoop fsck / -files -blocks -locations > dfs-v-new-fsck-1.log}}} [[BR]] and compare with {{{dfs-v-old-fsck-1.log}}}. These files should be identical, unless the {{{fsck}}} reporting format has changed in the new version.
 19. Start map-reduce cluster. [[BR]] {{{bin/start-mapred.sh}}}

In case of failure, the administrator should have the checkpoint files available in order to repeat the procedure from the appropriate point, or to restart the old version of Hadoop. The {{{*.log}}} files should help in investigating what went wrong during the upgrade.

@@ -57, +57 @@
 2. The '''safe mode''' implementation will further help to prevent the name node from making voluntary decisions on block deletion and replication.
 3. A '''faster fsck''' is required. ''Currently {{{fsck}}} processes 1-2 TB per minute.''
 4. Hadoop should provide a '''backup solution''' as a stand-alone application.
- 5. Introduce an explicit '''–upgrade option''' for DFS (see below) and a related
+ 5. Introduce an explicit '''-upgrade option''' for DFS (see below) and a related
 6. '''finalize upgrade''' command.

== Shutdown command: ==

@@ -116, +116 @@
 1. Stop map-reduce cluster(s) and all client applications running on the DFS cluster.
 2. Stop DFS using the shutdown command.
 3. Install the new version of the Hadoop software.
- 4. Start DFS cluster with the –upgrade option.
+ 4. Start DFS cluster with the -upgrade option.
 5. Start map-reduce cluster.
 6. Verify that the components run properly and finalize the upgrade when convinced. This is done using the {{{-finalizeUpgrade}}} option of the {{{hadoop dfsadmin}}} command.
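For reference, the upgrade steps above condense to roughly the following outline. This is a sketch, not a tested script; it assumes it is run from the Hadoop installation root and that {{{bin/start-dfs.sh}}} passes the {{{-upgrade}}} flag through to the name node as described above.

{{{
#!/bin/sh
# Upgrade outline (steps 1-6 above).
bin/stop-mapred.sh             # 1. stop the map-reduce cluster
# 2. stop DFS using the shutdown command (see "Shutdown command" above)
# 3. install the new version of the Hadoop software
bin/start-dfs.sh -upgrade      # 4. start DFS with the -upgrade option
bin/start-mapred.sh            # 5. start the map-reduce cluster
# 6. when convinced everything runs properly, finalize the upgrade:
bin/hadoop dfsadmin -finalizeUpgrade
}}}

The point of the separate finalize step is that the administrator can still fall back to the old version until convinced, per step 6, that the upgraded components run properly.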