hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hadoop 0.14 Upgrade" by RaghuAngadi
Date Wed, 22 Aug 2007 01:01:36 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by RaghuAngadi:
http://wiki.apache.org/lucene-hadoop/Hadoop_0%2e14_Upgrade

------------------------------------------------------------------------------
  = Upgrade Guide for Hadoop-0.14 =
  
- This page describes upgrade information that is specific to Hadoop-0.14. The usual upgrade
described in [:Hadoop_Upgrade: Hadoop Upgrade page] still applies for Hadoop-0.14. 
+ This page describes upgrade information that is specific to Hadoop-0.14. The normal upgrade
described in [:Hadoop_Upgrade: Hadoop Upgrade page] still applies for Hadoop-0.14. 
  
  == Upgrade Path ==
  
@@ -36, +36 @@

  
  == Monitoring the Upgrade ==
  
- The cluster stays in ''safeMode'' until the upgrade is complete. HDFS webui is a good place
to check if safeMode is on or off. As always log files from ''namenode'' and ''datanode''
are useful when nothing else helps.
+ The cluster stays in ''safeMode'' until the upgrade is complete. HDFS webui is a good place
to check if safeMode is on or off. As always, log files from ''namenode'' and ''datanode''
are useful when nothing else helps.
  
  Once the cluster is started with the {{{-upgrade}}} option, the simplest way to monitor the
upgrade is with the '{{{dfsadmin -upgradeProgress status}}}' command. 
  
@@ -70, +70 @@

  }}} 
  
   * {{{Status = 78%}}} : This is a rough approximation of how much of the upgrade is complete.
-  * {{{Block Level Stats}}} : Once the upgrade is started, Namenode iterates through all
the block to check how many of the blocks are upgrade. This information is useful on large
clusters where some datanodes may never complete upgrade of their blocks (discussed in later
sections).
+  * {{{Block Level Stats}}} : Once the upgrade starts, Namenode iterates through all the
blocks to check how many of them are upgraded. This information is useful on large clusters
where some datanodes may never complete the upgrade of their blocks (discussed in later sections).
-    * {{{Fully Upgraded}}} : Percentage of blocks, where the expected number of replicas
are upgraded. E.g. if a block has replication of 3, it is considered ''fully upgraded'' if
at least three datanodes that contain this blocks have completed their updating checksums.
+    * {{{Fully Upgraded}}} : Percentage of blocks for which the expected number of replicas
are upgraded. E.g. if a block has replication of 3, it is considered ''fully upgraded'' if
at least three datanodes that contain this block have finished upgrading their blocks.
     * {{{Minimally Upgraded}}} : Similar to above, number of upgraded replicas is at least
{{{dfs.min.replication}}} (default 1) and is less than expected number of replicas.
     * {{{Under Upgraded}}} : number of upgraded replicas is less than {{{dfs.min.replication}}}.
     * {{{Un-upgraded}}} : blocks with zero upgraded replicas.
   * {{{Brief Datanode Status}}} : Each datanode reports its progress to the namenode during
the upgrade. This shows average of percent completion on all the datanodes. This also shows
how many datanodes have completed their upgrade. For the upgrade to proceed to next stage,
all the datanodes should report completion of their local upgrade.
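
The four block categories listed under ''Block Level Stats'' can be sketched as a small shell function. This is only an illustration of the rules as described on this page, not the Namenode's actual code; the function name `classify_block` and its three arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify one block by how many of its replicas are upgraded.
# Arguments (all assumptions made for illustration):
#   $1 = number of upgraded replicas
#   $2 = expected number of replicas (the block's replication)
#   $3 = value of dfs.min.replication (defaults to 1, as on this page)
classify_block() {
    upgraded=$1
    expected=$2
    min_repl=${3:-1}

    if [ "$upgraded" -eq 0 ]; then
        # Zero upgraded replicas is checked first, since it is reported separately.
        echo "un-upgraded"
    elif [ "$upgraded" -lt "$min_repl" ]; then
        echo "under-upgraded"
    elif [ "$upgraded" -lt "$expected" ]; then
        # At least dfs.min.replication, but fewer than expected.
        echo "minimally-upgraded"
    else
        echo "fully-upgraded"
    fi
}
```

For example, `classify_block 3 3` prints `fully-upgraded`, while `classify_block 1 3` prints `minimally-upgraded` with the default minimum replication of 1.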
  
- Note that in some cases, a few blocks might be ''over-replicated'' in such cases, upgrade
might proceed to next stage even if some of the datanodes do not complete their upgrade. If
{{{Fully Upgraded}}} is calculated to be 100%, namenode will proceed to next stage.
+ Note that in some cases, a few blocks might be ''over-replicated''. In such a case, the upgrade
might proceed to the next stage even if some of the datanodes do not complete their upgrade. If
{{{Fully Upgraded}}} is calculated to be 100%, namenode will proceed to the next stage even if
not all the datanodes have completed their upgrade.
  
  ==== Potential Problems during Second Stage ====
-  * ''The upgrade might seem to be stuck'' : Each datanode reports its progress once every
minute. If the percent completion does not change change even afeter a few minutes, some datanodes
might have some unexpected problems. Use {{{details}}} option with {{{-upgradeProgress}}}
command to check which datanodes seem stagnant. {{{
+  * ''The upgrade might seem to be stuck'' : Each datanode reports its progress once every
minute. If the percent completion does not change even after a few minutes, some datanodes
might have run into unexpected problems. Use the {{{details}}} option with the {{{-upgradeProgress}}}
command to check which datanodes seem stagnant. {{{
  $ bin/hadoop dfsadmin -upgradeProgress details
  Distributed upgrade for version -6 is in progress. Status = 72%
  
@@ -101, +101 @@

                  192.168.0.24:50010        : 50 %         2044 u  1999 r  0 e
                  192.168.0.214:50010       : 100 %        4678 u  0 r     0 e
                  ...
- }}} You can run this command through '{{{grep -v "100 %"}}}' to find the nodes that have
not completed their upgrade. If the problem nodes can not be corrected, as a last resort you
can check ''Block Level Stats'' to see if the upgrade can be ''forced'' to next stage. E.g.
if 98% are fully-upgraded and 2% minimally-upgraded, then you can reasonably sure that at
least one copy of a block is upgraded. You can force next stage with {{{force}}} option :
{{{
+ }}} You can run this command through '{{{grep -v "100 %"}}}' to find the nodes that have
not completed their upgrade. If the problem nodes cannot be corrected, as a last resort you
can check ''Block Level Stats'' to see if the upgrade can be ''forced'' to the next stage. E.g.
if 98% are fully-upgraded and 2% are minimally-upgraded, then you can reasonably be sure that
at least one copy of each block is upgraded. You can force the next stage with the {{{force}}} option
: {{{
  $ bin/hadoop dfsadmin -upgradeProgress force
  Distributed upgrade for version -6 is in progress. Status = 90%
  
@@ -119, +119 @@

          can take longer than status implies.   
  }}} Note {{{Force Proceed is ON}}} in the status message.
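
The '{{{grep -v "100 %"}}}' filter mentioned earlier can be tried on sample text before using it on a live cluster. The sketch below is a minimal illustration: the function name `unfinished_nodes` is hypothetical, and the two input lines are copied from the sample {{{details}}} report on this page.

```shell
#!/bin/sh
# Hypothetical wrapper around the filter suggested on this page:
# keep only the lines that do not report "100 %".
# Reads the text of a "details" report on standard input.
unfinished_nodes() {
    grep -v "100 %"
}

# Sample lines copied from the report on this page. On a live cluster one
# would pipe 'bin/hadoop dfsadmin -upgradeProgress details' into the filter.
printf '%s\n' \
    "192.168.0.24:50010        : 50 %         2044 u  1999 r  0 e" \
    "192.168.0.214:50010       : 100 %        4678 u  0 r     0 e" \
    | unfinished_nodes
```

Only the line for 192.168.0.24 is printed. Note that the filter also passes through any header or summary lines of the real report, since they do not contain "100 %" either.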
  
- === Third Stage : Deleting {{{.crc}}} files ===
+ === Third Stage : Deleting .crc files ===
- Once the second stage is complete, Namenode reports 90% completiong. It does not have a
very good way of estimating time required for deleting the files. The ''status'' reports 90%
completion all through this stage. Later tests with larger number of files indicates that
it takes one hour to delete 2 million files on a rack server. The upgrade status report looks
like the following. {{{
+ Once the second stage is complete, Namenode reports 90% completion. It does not have a very
good way of estimating the time required for deleting the files. The ''status'' reports 90% completion
all through this stage. Later tests with a larger number of files indicate that it takes about one
hour to delete 2 million files on a rack server. The upgrade status report looks like the
following. {{{
  $ bin/hadoop dfsadmin -upgradeProgress status
  Distributed upgrade for version -6 is in progress. Status = 90%
  
@@ -144, +144 @@

  
  === Memory requirements ===
  
- HDFS nodes do not require more memory during the upgrade than for normal operation before
the upgrade. We observed that Namenode might use 5-10% more memory (or more GC in JVM) during
the upgrade. If the namenode was operating at the edge of its memory limits during the upgrade,
it could potentially have some problems. At any time, cluster can be restarted and the HDFS
resumes the upgrade.
+ HDFS nodes do not require more memory during the upgrade than for normal operation before
the upgrade. We observed that Namenode might use 5-10% more memory (or more GC in JVM) during
the upgrade. If the namenode was operating at the edge of its memory limits before the upgrade,
it could potentially have some problems. At any time, cluster can be restarted and the HDFS
resumes the upgrade.
  
  === Restarting a cluster ===
  
