hbase-commits mailing list archives

From jerry...@apache.org
Subject hbase git commit: HBASE-13251 Correct HBase, MapReduce, and the CLASSPATH section in HBase Ref Guide (li xiang)
Date Wed, 06 May 2015 04:26:34 GMT
Repository: hbase
Updated Branches:
  refs/heads/master 2e132db85 -> 664b2e4f1


HBASE-13251 Correct HBase, MapReduce, and the CLASSPATH section in HBase Ref Guide (li xiang)


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/664b2e4f
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/664b2e4f
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/664b2e4f

Branch: refs/heads/master
Commit: 664b2e4f11a06af2bc6d4876a3d6ed270b28e898
Parents: 2e132db
Author: Jerry He <jerryjch@apache.org>
Authored: Tue May 5 21:25:06 2015 -0700
Committer: Jerry He <jerryjch@apache.org>
Committed: Tue May 5 21:25:06 2015 -0700

----------------------------------------------------------------------
 .../apache/hadoop/hbase/util/ByteStringer.java  |  2 +-
 src/main/asciidoc/_chapters/mapreduce.adoc      | 27 ++++++++++++++------
 2 files changed, 20 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/664b2e4f/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
----------------------------------------------------------------------
diff --git a/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java b/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
index 5b10b83..afa9297 100644
--- a/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
+++ b/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
@@ -25,7 +25,7 @@ import com.google.protobuf.ByteString;
 import com.google.protobuf.HBaseZeroCopyByteString;
 
 /**
- * Hack to workaround HBASE-1304 issue that keeps bubbling up when a mapreduce context.
+ * Hack to workaround HBASE-10304 issue that keeps bubbling up when a mapreduce context.
  */
 @InterfaceAudience.Private
 public class ByteStringer {

http://git-wip-us.apache.org/repos/asf/hbase/blob/664b2e4f/src/main/asciidoc/_chapters/mapreduce.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/mapreduce.adoc b/src/main/asciidoc/_chapters/mapreduce.adoc
index a008a4f..2a42af2 100644
--- a/src/main/asciidoc/_chapters/mapreduce.adoc
+++ b/src/main/asciidoc/_chapters/mapreduce.adoc
@@ -51,27 +51,38 @@ In the notes below, we refer to o.a.h.h.mapreduce but replace with the o.a.h.h.m
 
 By default, MapReduce jobs deployed to a MapReduce cluster do not have access to either the HBase configuration under `$HBASE_CONF_DIR` or the HBase classes.
 
-To give the MapReduce jobs the access they need, you could add _hbase-site.xml_ to the _$HADOOP_HOME/conf/_ directory and add the HBase JARs to the _HADOOP_HOME/conf/_ directory, then copy these changes across your cluster.
-You could add _hbase-site.xml_ to _$HADOOP_HOME/conf_ and add HBase jars to the _$HADOOP_HOME/lib_ directory.
-You would then need to copy these changes across your cluster or edit _$HADOOP_HOMEconf/hadoop-env.sh_ and add them to the `HADOOP_CLASSPATH` variable.
+To give the MapReduce jobs the access they need, you could add _hbase-site.xml_ to _$HADOOP_HOME/conf_ and add HBase jars to the _$HADOOP_HOME/lib_ directory.
+You would then need to copy these changes across your cluster. Or you can edit _$HADOOP_HOME/conf/hadoop-env.sh_ and add them to the `HADOOP_CLASSPATH` variable.
 However, this approach is not recommended because it will pollute your Hadoop install with HBase references.
 It also requires you to restart the Hadoop cluster before Hadoop can use the HBase data.
 
+The recommended approach is to let HBase add its dependency jars itself and use `HADOOP_CLASSPATH` or `-libjars`.
+
 Since HBase 0.90.x, HBase adds its dependency JARs to the job configuration itself.
 The dependencies only need to be available on the local `CLASSPATH`.
-The following example runs the bundled HBase link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter] MapReduce job against a table named `usertable` If you have not set the environment variables expected in the command (the parts prefixed by a `$` sign and curly braces), you can use the actual system paths instead.
+The following example runs the bundled HBase link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter] MapReduce job against a table named `usertable`.
+If you have not set the environment variables expected in the command (the parts prefixed by a `$` sign and surrounded by curly braces), you can use the actual system paths instead.
 Be sure to use the correct version of the HBase JAR for your system.
-The backticks (``` symbols) cause ths shell to execute the sub-commands, setting the `CLASSPATH` as part of the command.
+The backticks (``` symbols) cause ths shell to execute the sub-commands, setting the output of `hbase classpath` (the command to dump HBase CLASSPATH) to `HADOOP_CLASSPATH`.
 This example assumes you use a BASH-compatible shell.
 
 [source,bash]
 ----
-$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter usertable
+$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/lib/hbase-server-VERSION.jar rowcounter usertable
 ----
 
 When the command runs, internally, the HBase JAR finds the dependencies it needs for ZooKeeper, Guava, and its other dependencies on the passed `HADOOP_CLASSPATH` and adds the JARs to the MapReduce job configuration.
 See the source at `TableMapReduceUtil#addDependencyJars(org.apache.hadoop.mapreduce.Job)` for how this is done.
 
+The command `hbase mapredcp` can also help you dump the CLASSPATH entries required by MapReduce, which are the same jars `TableMapReduceUtil#addDependencyJars` would add.
+You can add them together with HBase conf directory to `HADOOP_CLASSPATH`.
+For jobs that do not package their dependencies or call `TableMapReduceUtil#addDependencyJars`, the following command structure is necessary:
+
+[source,bash]
+----
+$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp`:${HBASE_HOME}/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(${HBASE_HOME}/bin/hbase mapredcp | tr ':' ',') ...
+----
+
 [NOTE]
 ====
 The example may not work if you are running HBase from its build directory rather than an installed location.
@@ -85,11 +96,11 @@ If this occurs, try modifying the command as follows, so that it uses the HBase
 
 [source,bash]
 ----
-$ HADOOP_CLASSPATH=${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable
+$ HADOOP_CLASSPATH=${HBASE_BUILD_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_BUILD_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_BUILD_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable
 ----
 ====
 
-.Notice to MapReduce users of HBase 0.96.1 and above
+.Notice to MapReduce users of HBase between 0.96.1 and 0.98.4
 [CAUTION]
 ====
 Some MapReduce jobs that use HBase fail to launch.
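
Note on the `hbase mapredcp` example added by this commit: `hbase mapredcp` prints a colon-separated classpath, while Hadoop's `-libjars` option expects a comma-separated list, which is why the command pipes the output through `tr ':' ','`. A minimal sketch of just that conversion follows. It assumes a BASH-compatible shell and uses a hypothetical classpath string in place of real `hbase mapredcp` output; the jar paths are made up for illustration.

```shell
# Hypothetical stand-in for the output of `${HBASE_HOME}/bin/hbase mapredcp`.
# The paths are illustrative only; real output depends on the installation.
mapredcp_output="/opt/hbase/lib/hbase-server.jar:/opt/hbase/lib/hbase-client.jar:/opt/hbase/lib/zookeeper.jar"

# HADOOP_CLASSPATH takes the colon-separated form unchanged,
# while -libjars needs the same entries separated by commas.
libjars=$(printf '%s' "$mapredcp_output" | tr ':' ',')

echo "$libjars"
```

The same jar list is therefore passed twice in the committed command: once as-is through `HADOOP_CLASSPATH`, and once in comma-separated form through `-libjars`.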
