hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/MapReduce" by stack
Date Thu, 07 Feb 2008 00:59:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/MapReduce

The comment on the change is:
Simplify

------------------------------------------------------------------------------
  = Hbase, MapReduce and the CLASSPATH =
  
- !MapReduce jobs deployed to a mapreduce cluster do not usually have access to the configuration
under ''$HBASE_CONF_DIR'' nor to hbase classes.
+ !MapReduce jobs deployed to a mapreduce cluster do not by default have access to the hbase
configuration under ''$HBASE_CONF_DIR'' nor to hbase classes.
  
- Any hbase particular configuration not hard-coded into the job jar classes -- e.g. the address
of the target hbase master -- that is needed by running maps and/or reduces needs to be either
included explicitly in the job jar, by jarring an appropriately configured ''hbase-site.xml''
into a conf subdirectory, or by adding an ''hbase-site.xml'' under ''$HADOOP_HOME/conf'' and
copying it across the mapreduce cluster.  The same holds true for any hbase classes referenced
by the mapreduce job jar.  By default the hbase classes are not available on the general mapreduce
''CLASSPATH''.  To add them, you have a couple of options. Either include the hadoop-X.X.X-hbase.jar
in the job jar under the lib subdirectory or copy the hadoop-X.X.X-hbase.jar to $HADOOP_HOME/lib
and copy it across the cluster.
- 
- But the cleanest means of adding hbase configuration and classes to the cluster CLASSPATH
is by uncommenting ''HADOOP_CLASSPATH'' in ''$HADOOP_HOME/conf/hadoop-env.sh'' and adding
the path to the hbase jar and ''conf'' directory.  Then copy the amended configuration across
the cluster.  You'll need to restart the mapreduce cluster if you want it to notice the new
configuration.
+ You could add ''hbase-site.xml'' to $HADOOP_HOME/conf and the hbase jar to $HADOOP_HOME/lib
and copy these changes across your cluster, but the cleanest means of adding the hbase configuration
and classes to the cluster CLASSPATH is to uncomment ''HADOOP_CLASSPATH'' in ''$HADOOP_HOME/conf/hadoop-env.sh''
and add to it the path to the hbase jar and the ''$HBASE_CONF_DIR'' directory.  Then copy the amended
configuration across the cluster.  You'll need to restart the mapreduce cluster if you want
it to notice the new configuration.
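+ By way of illustration, a minimal ''hbase-site.xml'' carrying the address of the target hbase master might look like the following (the host and port are placeholders for your own environment, and the ''hbase.master'' property name assumes the hbase release documented here):

{{{
<configuration>
  <property>
    <name>hbase.master</name>
    <value>master.example.com:60000</value>
  </property>
</configuration>
}}}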
  
  For example, here is how you would amend ''hadoop-env.sh'' adding hbase classes and the
!PerformanceEvaluation class from hbase test classes to the hadoop ''CLASSPATH'':
  
@@ -14, +12 @@

  # export HADOOP_CLASSPATH=
  export HADOOP_CLASSPATH=$HBASE_HOME/build/test:$HBASE_HOME/build/hadoop-0.15.0-dev-hbase.jar}}}
  
- (Expand $HBASE_HOME appropriately in the in accordance with your local environment)
+ Expand $HBASE_HOME in accordance with your local environment.
  
  And then, this is how you would run the PerformanceEvaluation MR job to put up 4 clients:
  
- {{{ > $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite
4 }}}
+ {{{ > $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite
4
+ }}}
  
- (The PerformanceEvaluation class wil be found on the CLASSPATH because you added $HBASE_HOME/build/test
to HADOOP_CLASSPATH)
+ The PerformanceEvaluation class will be found on the CLASSPATH because you added $HBASE_HOME/build/test
to HADOOP_CLASSPATH.
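+ For the job-jar route mentioned above, the bundling steps might look roughly like this (the jar name ''myjob.jar'' is a placeholder, and the hbase jar name will vary with your release; adapt the paths to your environment):

{{{
mkdir -p conf lib
cp $HBASE_CONF_DIR/hbase-site.xml conf/
cp $HBASE_HOME/build/hadoop-0.15.0-dev-hbase.jar lib/
jar uf myjob.jar conf/hbase-site.xml lib/hadoop-0.15.0-dev-hbase.jar
}}}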
  
  = Hbase as MapReduce job data source and sink =
  
