hbase-dev mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Some HBase M/R confusion
Date Thu, 23 Feb 2012 05:14:53 GMT
Thanks Stack.

Missed the "file:" part in the first case... Stupid... Must pick up an hbase-site.xml from
somewhere else (or more likely just using the defaults, because it can't find one).

Either way we need to update the book I think.

As for the protobufs: this is trunk, and it looks like this is related to HBASE-5394. It also
happens in the non-secure branch.
Filed HBASE-5460. I assume we just add protobufs as a jar dependency; I will do that tonight.

-- Lars

 From: Stack <stack@duboce.net>
To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com> 
Sent: Wednesday, February 22, 2012 8:59 PM
Subject: Re: Some HBase M/R confusion
On Wed, Feb 22, 2012 at 6:36 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> 1. The HBase book states to run M/R jobs like export here: http://hbase.apache.org/book/ops_mgt.html#export
> bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]

This is running the Export tool, i.e. the Export class's main.  The
CLASSPATH is that built by bin/hbase.

> 2. Whereas the Javadoc says here: http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
> HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.0.jar export ...

Here we're loading the HADOOP_CLASSPATH with hbase classpath.  We then
pass the hbase.jar as a 'mapreduce fat jar' for bin/hadoop to run.
When we build our hbase.jar, we set its Main-Class to be the Driver
class under mapreduce.  Driver parses the args to figure out which of our
selection of common mapreduce programs to run.  Here you've chosen
export (leave off the 'export' arg to see the complete list).

Either means should work but #2 is a bit more palatable (excepting the
ugly CLASSPATH preamble).

> In the first case (#1) I find that the job always fails to create the output dir:
> java.io.IOException: Mkdirs failed to create file:/exports/_temporary/_attempt_local_0001_m_000000_0
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:378)

It's running local?   It's trying to write to /exports on your local
disk?  It's probably not picking up the hadoop configs and so is using
local mapreducing.
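
A quick way to test that guess is to look for a Hadoop core-site.xml on the classpath that bin/hbase builds. The sketch below is only an illustration: the function name find_hadoop_conf is made up here, and it checks directory entries only, not files packed inside jars.

```shell
# Print every directory on a colon-separated classpath that holds a
# core-site.xml. Exits non-zero when none is found, in which case Hadoop
# would fall back to its defaults (local filesystem, local job runner).
# find_hadoop_conf is a hypothetical helper name, not part of HBase.
find_hadoop_conf() (
  status=1
  IFS=:     # split the argument on colons
  set -f    # keep wildcard entries such as lib/* literal
  for entry in $1; do
    if [ -f "$entry/core-site.xml" ]; then
      printf '%s\n' "$entry"
      status=0
    fi
  done
  [ "$status" -eq 0 ]
)

# Against a real install (assumes HBASE_HOME is set):
#   find_hadoop_conf "$("${HBASE_HOME}/bin/hbase" classpath)"
```

If nothing is printed, the job has no cluster config to read, which would fit the file: scheme in the Mkdirs error.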

> In the 2nd case (#2) I get past the creation of the output dir, and then it fails because it cannot find class com.google.protobuf.Message.

It's not adding protobufs to the CLASSPATH?  Or the versions disagree?   The
protobufs bundled with hbase is being found first and it's not what Hadoop's
protobuffing wants?
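
The first two questions can be checked on the submitting side by listing which classpath entries mention protobuf. This is a rough sketch with a made-up helper name (list_matching_entries). It only inspects the local classpath string, so it says nothing about which jars actually get shipped to the map tasks.

```shell
# Print the classpath entries that contain a pattern (case-insensitive).
# Exits non-zero when nothing matches. More than one line of output for
# "protobuf" would hint at two versions competing on the classpath.
# list_matching_entries is a hypothetical helper name.
list_matching_entries() (
  pattern=$1
  classpath=$2
  printf '%s\n' "$classpath" | tr ':' '\n' | grep -i -- "$pattern"
)

# Against a real install (assumes HBASE_HOME is set):
#   list_matching_entries protobuf "$("${HBASE_HOME}/bin/hbase" classpath)"
```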

> I am using the HBase security branch and find that I need to add com.google.protobuf.Message.class in TableMapReduceUtil.addDependencyJars.
> If I do that, I can successfully run an export job using method #2.

This is probably a bug.

This is 0.92.x?  Or trunk?  Is protobufs a new dependency that hbase needs?

> The 2nd issue I found looks like a bug with the HBase security branch.
> I am not sure about the first issue, is the documentation in the HBase book outdated?

I think yeah, we should encourage #2; that way we'll use the proper config
and find the cluster.  We would have to add the hadoop config to #1 to make
it work.
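
One possible way to add the hadoop config to #1 is to put the Hadoop conf directory on HBASE_CLASSPATH, which bin/hbase appends to the classpath it builds. The wrapper below is a sketch under assumptions: the function name and the RUNNER preview switch are invented for illustration, and HADOOP_CONF_DIR is assumed to point at the directory holding core-site.xml.

```shell
# Run the Export tool via bin/hbase (method #1) with the Hadoop conf dir
# added to HBASE_CLASSPATH. Set RUNNER=echo to print the command instead
# of running it. run_export_with_hadoop_conf and RUNNER are hypothetical.
run_export_with_hadoop_conf() (
  table=$1
  outdir=$2
  conf_dir=${HADOOP_CONF_DIR:?set HADOOP_CONF_DIR to the directory holding core-site.xml}
  hbase_home=${HBASE_HOME:?set HBASE_HOME to the HBase install directory}
  # Keep any existing HBASE_CLASSPATH and append the conf dir to it.
  ${RUNNER:-} env HBASE_CLASSPATH="${HBASE_CLASSPATH:+$HBASE_CLASSPATH:}$conf_dir" \
    "$hbase_home/bin/hbase" org.apache.hadoop.hbase.mapreduce.Export "$table" "$outdir"
)

# Preview only:
#   RUNNER=echo run_export_with_hadoop_conf <tablename> <outputdir>
```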

My guess is it's not just the security branch.
