hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "FAQ" by SomeOtherAccount
Date Thu, 09 Sep 2010 17:20:25 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "FAQ" page has been changed by SomeOtherAccount.
http://wiki.apache.org/hadoop/FAQ?action=diff&rev1=72&rev2=73

--------------------------------------------------

  
  <<BR>> <<Anchor(3)>> '''3. [[#A3|How well does Hadoop scale?]]'''
  
- Hadoop has been demonstrated on clusters of up to 2000 nodes.  Sort performance on 900 nodes
is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]]
using these non-default configuration values:
+ Hadoop has been demonstrated on clusters of up to 4000 nodes.  Sort performance on 900 nodes
is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]]
using these non-default configuration values:
  
   * `dfs.block.size = 134217728`
   * `dfs.namenode.handler.count = 40`
@@ -276, +276 @@

  
  It appears that DatanodeID.getHost() is the standard place to retrieve this name, and the
machineName variable, populated in DataNode.java#startDataNode, is where the name is first
set. The first method attempted is to get "slave.host.name" from the configuration; if that
is not available, DNS.getDefaultHost is used instead.
  
+ <<BR>> <<Anchor(31)>> '''31. [[#A31|On an individual data node,
how do you balance the blocks on the disk?]]'''
+ 
+ Hadoop currently does not have a method by which to do this automatically.  To do this manually:
+ 
+  1. Take down the HDFS
+  2. Use the UNIX mv command to move the individual blocks and meta pairs from one directory
to another on each host
+  3. Restart the HDFS
+ 
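The manual move in step 2 of the new FAQ entry could be scripted roughly as below. This is only a sketch under stated assumptions, not an official Hadoop tool: it assumes block data files are named blk_<id> and their metadata files blk_<id>_<genstamp>.meta, that both directories already exist on the same data node, and that HDFS has been taken down first (step 1). The function names find_block_pairs and move_block_pairs are hypothetical helpers introduced here for illustration.

```python
import re
import shutil
from pathlib import Path

# Assumed naming: data file "blk_<id>", metadata file "blk_<id>_<genstamp>.meta".
BLOCK_RE = re.compile(r"^blk_-?\d+$")


def find_block_pairs(src_dir):
    """Return (block, meta) path pairs found directly in src_dir."""
    src = Path(src_dir)
    pairs = []
    for block in sorted(src.iterdir()):
        if not (block.is_file() and BLOCK_RE.match(block.name)):
            continue
        metas = sorted(src.glob(block.name + "_*.meta"))
        if len(metas) == 1:  # skip blocks without exactly one meta file
            pairs.append((block, metas[0]))
    return pairs


def move_block_pairs(src_dir, dst_dir, max_pairs):
    """Move up to max_pairs block/meta pairs; return the moved block names.

    Run this only while HDFS is stopped (steps 1 and 3 of the FAQ entry).
    """
    dst = Path(dst_dir)
    moved = []
    for block, meta in find_block_pairs(src_dir)[:max_pairs]:
        # Never overwrite anything that already exists at the destination.
        if (dst / block.name).exists() or (dst / meta.name).exists():
            continue
        shutil.move(str(block), str(dst / block.name))
        shutil.move(str(meta), str(dst / meta.name))
        moved.append(block.name)
    return moved
```

Each block is moved together with its meta file so that a pair is never split across directories. Blocks without exactly one matching meta file are left where they are. The source and destination paths are whatever storage directories the data node is configured with; none are assumed here.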
