hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "PerformanceTuning" by SteveLoughran
Date Wed, 24 Jun 2009 16:16:31 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/PerformanceTuning

The comment on the change is:
Broaden to more than just HBase, mention the jvm reuse option

------------------------------------------------------------------------------
+ == NameNode Performance Tips ==
+ 
+  * Lots of RAM; you don't want the NameNode JVM to be swapping.
+ 
+ 
+ 
+ == MapReduce Performance ==
+ 
+ You can save a lot of time by enabling JVM reuse on MR jobs. In the JobTracker's configuration,
or in the job itself, set {{{mapred.job.reuse.jvm.num.tasks}}} to the number of tasks a JVM may run
''for the same map or reduce transform'', or to -1 to reuse without limits. This cuts the
time spent on JVM startup and teardown.
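
As a minimal sketch of the setting described above, the fragment below enables unlimited JVM reuse. The file it belongs in is an assumption and depends on the Hadoop version ({{{mapred-site.xml}}} in releases that split the site configuration, or a per-job configuration); the value -1 is only one possible choice.

```xml
<!-- Illustrative fragment; the target file (e.g. mapred-site.xml) is an assumption. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- -1 lets a task JVM be reused without limit; 1 means one task per JVM (no reuse). -->
  <value>-1</value>
</property>
```

A positive value such as 10 would instead cap how many tasks of the same job run in one JVM before it is torn down.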
+ 
+ The more copies of a block there are, the more places there are to schedule work on the same
host as the block, which eliminates the need to copy the block over the network. Set {{{dfs.replication}}}
on files to more than the default (usually 3) if you want them to be available in more
places.
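
A hedged sketch of raising the replication factor follows. It assumes the property meant in the paragraph above is the standard HDFS setting {{{dfs.replication}}}; the value 5 is purely illustrative, and the fragment would go in the configuration used by the client that writes the files.

```xml
<!-- Illustrative fragment; assumes dfs.replication is the property intended above. -->
<property>
  <name>dfs.replication</name>
  <!-- Number of replicas requested for files written with this configuration (example value). -->
  <value>5</value>
</property>
```

Each extra replica costs additional disk space and write traffic, so a higher factor is best reserved for files that many tasks read.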
+ 
- == Performance tips ==
+ == HBase Performance tips ==
  
   * Use compression, see [UsingLzoCompression]
   * RAM, RAM, RAM.  Don't starve HBase.
   * More CPUs are important, as you will see in the next section
-  * Use a 64 bit platform, and a 64 bit JVM.
+  * Use a 64-bit platform, and a 64-bit JVM.
   * Your clients might need tuning: [http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html]
-  * Make sure that java implies -server on your machines, or else you will have to explicitly
enable it.
+  * Make sure that the command {{{java}}} implies {{{-server}}} on your machines, or else
you will have to explicitly enable it.
  
- == JVM and GC ==
+ == HBase JVM and GC ==
  
  HBase is memory intensive, and using the default GC you can see long pauses in all threads.
 With the addition of ZooKeeper this can cause false errors, as ZooKeeper and the HBase master
think a regionserver has died.  
  
@@ -78, +90 @@

  export HBASE_OPTS="-XX:NewSize=6m -XX:MaxNewSize=6m <cms options from above> <gc logging options from above>"
  }}}
  
- 
- 
