
From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Trivial Update of "Hadoop 0.14 Upgrade" by RaghuAngadi
Date Tue, 21 Aug 2007 23:09:16 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by RaghuAngadi:
http://wiki.apache.org/lucene-hadoop/Hadoop_0%2e14_Upgrade

------------------------------------------------------------------------------
  = Upgrade Guide for Hadoop-0.14 =
  
  This page describes upgrade information specific to Hadoop-0.14. The usual upgrade procedure described on the [:Hadoop_Upgrade: Hadoop Upgrade page] still applies to Hadoop-0.14.
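For reference, the usual flow from that page boils down to stopping the old cluster, installing the new release, and starting HDFS in upgrade mode. A minimal sketch (assuming the standard {{{bin/}}} scripts from the Hadoop tarball; check the exact options against your installation):

{{{
# Minimal sketch of the usual HDFS upgrade flow.
bin/stop-dfs.sh                        # stop the cluster running the old version
# ... install the new release, reusing the same conf/ and dfs.name.dir ...
bin/start-dfs.sh -upgrade              # start HDFS in upgrade mode
bin/hadoop dfsadmin -finalizeUpgrade   # later, once the new version checks out
}}}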
+ 
+ == Upgrade Path ==
+ 
+ We have tested upgrading from 0.12 to 0.14 and from 0.13.1 to 0.14. However, we recommend upgrading to 0.13.1 before this upgrade. Hadoop-0.13 adds an important upgrade-related feature that is very useful for upgrades like the one described here: while upgrading from 0.13, if something goes badly wrong, you can always ''rollback'' to the pre-upgrade state by installing 0.13 again. See the [:Hadoop_Upgrade: Hadoop Upgrade page] for more information. Upgrading from 0.11 and earlier versions is very similar to upgrading from 0.12, but these cases are not extensively tested.
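As an example, a failed 0.13.1-to-0.14 attempt could be undone roughly like this (a sketch, assuming the cluster was started with {{{-upgrade}}} and the upgrade has not been finalized):

{{{
# Hypothetical rollback after a bad 0.13.1 -> 0.14 upgrade attempt.
bin/stop-dfs.sh              # stop the partially upgraded cluster
# ... reinstall the 0.13.1 release ...
bin/start-dfs.sh -rollback   # restore the pre-upgrade namespace and block data
}}}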
  
  == Brief Upgrade Procedure ==
  
@@ -21, +25 @@

  
  == Block CRC Upgrade ==
  
- Hadoop-0.14 maintains checksums for HDFS data differently than earlier versions. Before Hadoop-0.14, checksum for a file {{{f.txt}}} is stored in another HDFS file {{{.f.txt.crc}}}. In Hadoop-0.14, there are no such ''shadow'' checksum files. In stead, checksum is stored with each ''block'' of data at the ''datanode''. [http://issues.apache.org/jira/browse/HADOOP-1134 HADOOP-1134] describes this feature in great details. In order to migrate to the new structure, each datanode reads the checksum data from {{{.crc}}} files in HDFS for each of its blocks and stores the the checksum next to the block in local filesystem.
+ Hadoop-0.14 maintains checksums for HDFS data differently than earlier versions. Before Hadoop-0.14, the checksum for a file {{{f.txt}}} is stored in another HDFS file, {{{.f.txt.crc}}}. In Hadoop-0.14, there are no such ''shadow'' checksum files. Instead, the checksum is stored with each ''block'' of data at the datanodes. [http://issues.apache.org/jira/browse/HADOOP-1134 HADOOP-1134] describes this feature in great detail. In order to migrate to the new structure, each datanode reads checksum data from the {{{.crc}}} files in HDFS for each of its blocks and stores the checksum next to the block in the local filesystem.
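The layout change can be pictured as follows (a sketch with made-up file and block names; the local path depends on {{{dfs.data.dir}}}):

{{{
# Before 0.14: a hidden shadow file in HDFS holds the checksums.
$ bin/hadoop dfs -ls /user/foo
/user/foo/f.txt
/user/foo/.f.txt.crc              # shadow checksum file for f.txt

# From 0.14 on: each datanode stores checksum data next to each
# block replica in its local filesystem.
$ ls <dfs.data.dir>/current
blk_3141592653589793238
blk_3141592653589793238.meta      # per-block checksum data
}}}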
  
- Depending on number of blocks and number of files in HDFS, upgrade can take anywhere from a few minutes to a few hours.
+ Depending on the number of blocks and the number of files in HDFS, the upgrade can take anywhere from a few minutes to a few hours.
  
- There are three stages in this upgrade :
+ There are three stages in the Block CRC upgrade (a monitoring sketch follows the list):
   1. '''Safe Mode''': Similar to a normal restart of the cluster, the namenode waits for the datanodes in the cluster to report their blocks. The cluster may wait in this state for a long time if some of the datanodes do not report their blocks.
   1. '''Datanode Upgrade''': Once most of the blocks are reported, the namenode asks the registered datanodes to start their local upgrade. The namenode waits for ''all'' the datanodes to complete their upgrade.
   1. '''Deleting {{{.crc}}} files''': The namenode deletes the {{{.crc}}} files that were previously used for storing checksums.
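During the upgrade an administrator can at least watch the safe-mode stage from the shell; the later stages are reported in the namenode and datanode logs. A sketch (assuming the standard {{{dfsadmin}}} options and default log locations):

{{{
# Stage 1: watch the namenode's safe mode state.
bin/hadoop dfsadmin -safemode get    # prints whether safe mode is ON or OFF
bin/hadoop dfsadmin -safemode wait   # blocks until the namenode leaves safe mode

# Stages 2 and 3 are driven by the namenode; follow its log for progress.
tail -f logs/hadoop-*-namenode-*.log
}}}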
