hadoop-mapreduce-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Including external libraries in my job.
Date Tue, 03 May 2011 13:42:35 GMT
Hi,

I've written my first, very simple job that does something with HBase.

Now when I try to submit my jar to my cluster I get this:

[nbasjes@master ~/src/catalogloader/run]$ hadoop jar catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader /user/nbasjes/Minicatalog.xml
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
        at nl.basjes.catalogloader.Loader.main(Loader.java:156)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
...

I've found this blog post, which promises help:
http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/

Quote:
    "1. Include the JAR in the “-libjars” command line option of the
`hadoop jar …` command. The jar will be placed in distributed cache
and will be made available to all of the job’s task attempts. "

However one of the comments states:
    "Unfortunately, method 1 only work before 0.18, it doesn’t work in 0.20."

Indeed, I can't get it to work this way.

I've tried something as simple as:
export HADOOP_CLASSPATH=/usr/lib/hbase/hbase-0.90.1-cdh3u0.jar:/usr/lib/zookeeper/zookeeper-3.3.3-cdh3u0.jar
and then ran the job, but (as expected) that only means the tasks on
the processing nodes fail with a similar error:
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:248)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:486)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
...
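
For reference, the "-libjars" method quoted above would be invoked roughly as in the sketch below. This is an illustration under assumptions, not a confirmed fix: it assumes the Loader class parses Hadoop's generic options (for example by running through ToolRunner), which the message does not state, and it reuses the jar paths from the HADOOP_CLASSPATH attempt. The LIBJARS and CMD variable names are made up for the example. The command is only printed, not executed.

```shell
# Jar paths copied from the HADOOP_CLASSPATH attempt above (colon-separated).
HADOOP_CLASSPATH=/usr/lib/hbase/hbase-0.90.1-cdh3u0.jar:/usr/lib/zookeeper/zookeeper-3.3.3-cdh3u0.jar
export HADOOP_CLASSPATH

# -libjars takes a comma-separated list, so convert the separators.
# LIBJARS is a hypothetical helper variable, not something Hadoop reads.
LIBJARS=$(printf '%s' "$HADOOP_CLASSPATH" | tr ':' ',')

# Generic options such as -libjars go before the job's own arguments.
# Assumption: Loader hands its arguments to Hadoop's generic option parsing;
# if it does not, -libjars is passed through as a plain argument and ignored.
CMD="hadoop jar catalogloader-1.0-SNAPSHOT.jar nl.basjes.catalogloader.Loader -libjars $LIBJARS /user/nbasjes/Minicatalog.xml"

# Print the command instead of running it.
echo "$CMD"
```

Keeping HADOOP_CLASSPATH exported covers the client JVM that runs main(), while "-libjars" is meant to ship the same jars to the task attempts.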

So what is the correct way of doing this?

-- 
Met vriendelijke groeten,

Niels Basjes
