From Utkarsh Gupta <Utkarsh_Gu...@infosys.com>
Subject RE: Including third party jar files in Map Reduce job
Date Wed, 04 Apr 2012 11:55:07 GMT
Hi Devaraj,

The code is running now after copying the jar to each node.
I must have been making a mistake earlier.
Thanks Devaraj and Bejoy :)

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com] 
Sent: Wednesday, April 04, 2012 2:08 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: Including third party jar files in Map Reduce job

As Bejoy mentioned,

If you have copied the jar to $HADOOP_HOME/lib, then you must copy it to all the nodes in the
cluster. (or)

If you want to use the -libjars option, your application should implement Tool to support
generic options. Please check the link below for more details.
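
[Editor's sketch] The two -libjars attempts quoted further down differ in where the option sits relative to the input and output paths. The following is a simplified illustration only, not Hadoop's GenericOptionsParser. It assumes, as the Hadoop command documentation describes, that generic options must come before the application's own arguments. The class and method names (GenericOptionOrder, findLibJars) are hypothetical.

```java
/**
 * Simplified illustration only: this is NOT Hadoop's GenericOptionsParser.
 * It mimics one behavior assumed here: option parsing stops at the first
 * argument that is not an option, so "-libjars" is only recognized when it
 * appears before the application's own arguments.
 */
public class GenericOptionOrder {

    /** Returns the value passed to -libjars, or null if it was not recognized. */
    public static String findLibJars(String[] args) {
        int i = 0;
        while (i < args.length && args[i].startsWith("-")) {
            if (args[i].equals("-libjars") && i + 1 < args.length) {
                return args[i + 1];
            }
            i += 2; // skip an unrelated option together with its value
        }
        return null; // hit a non-option argument (or the end) first
    }

    public static void main(String[] args) {
        String[] optionFirst = {"-libjars", "/home/some/dir/somejar.jar", "in", "out"};
        String[] optionLast = {"in", "out", "-libjars", "/home/some/dir/somejar.jar"};
        System.out.println("option before app args: " + findLibJars(optionFirst));
        System.out.println("option after app args:  " + findLibJars(optionLast));
    }
}
```

Under that assumption, the option would go after the jar name (and main class, if one is given) and before the input and output paths. It also only takes effect if the driver hands its arguments to the generic option handling, which is what implementing Tool provides.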


From: Bejoy Ks [bejoy.hadoop@gmail.com]
Sent: Wednesday, April 04, 2012 1:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

Hi Utkarsh
You can add third-party jars to your MapReduce job in any of the following ways:

1) Use -libjars:
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) Include the third-party jars in the /lib folder while packaging your application.

3) If you add the jar to HADOOP_HOME/lib, you need to do so on all nodes.

Bejoy KS
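
[Editor's sketch] For option 2, it can help to confirm that the third-party jar really is packaged inside the job jar under lib/, rather than merely referenced from the manifest. The sketch below uses only the JDK's java.util.jar API. The class and method names (BundledJars, bundledLibs) are hypothetical and not part of Hadoop.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class BundledJars {

    /** Returns the names of all entries of the form lib/...jar inside the given jar. */
    public static List<String> bundledLibs(String jobJarPath) throws IOException {
        List<String> libs = new ArrayList<>();
        try (JarFile jar = new JarFile(jobJarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    libs.add(name);
                }
            }
        }
        return libs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: BundledJars <job.jar>");
            return;
        }
        // Print every jar that is packaged under lib/ in the job jar.
        for (String lib : bundledLibs(args[0])) {
            System.out.println(lib);
        }
    }
}
```

An empty result for the job jar would suggest the dependency was not bundled and has to reach the task nodes some other way (options 1 or 3).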

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Utkarsh_Gupta@infosys.com> wrote:
Hi Devaraj,

I have already copied the required jar file into the $HADOOP_HOME/lib folder.
Can you tell me where to add the generic option -libjars?

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
       at wordcount.MyMapper.map(MyMapper.java:22)
       at wordcount.MyMapper.map(MyMapper.java:14)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
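
[Editor's sketch] The trace above shows the ClassNotFoundException being raised inside the map task, not in the client that submitted the job. A small JDK-only check like the one below can report whether a class is visible to the JVM it runs in and where it was loaded from. The class and method names (ClassPathCheck, locate) are hypothetical. It only describes the JVM it is executed in, so a positive result on the submitting machine says nothing about the task JVMs on other nodes.

```java
import java.security.CodeSource;

public class ClassPathCheck {

    /** Describes where a class was loaded from, or reports that it is missing. */
    public static String locate(String className) {
        try {
            // Load without initializing; we only want to know if it is visible.
            Class<?> c = Class.forName(className, false, ClassPathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "found (no code source, likely a JDK class)"
                               : "found in " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.commons.math3.random.RandomDataImpl";
        System.out.println(name + ": " + locate(name));
    }
}
```

Running the same check with the classpath a task would see on each node is one way to tell a jar missing on some nodes apart from a jar missing everywhere.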

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com]
Sent: Wednesday, April 04, 2012 12:35 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args...

If you want commons-math3.jar to be available to all the tasks, you can do either of these:
1. Copy the jar file into the $HADOOP_HOME/lib dir, or
2. Use the generic option -libjars.

Can you give the stack trace of your problem, showing which class is causing the
ClassNotFoundException (i.e. the main class or the math lib class)?

From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using the Apache Commons Math library.
I used NetBeans to build the jar file, and the manifest has the path to the commons-math jar as lib/commons-math3.jar.
I have placed this jar file in the HADOOP_HOME/lib folder, but I am still getting a ClassNotFoundException.
I tried using the -libjars option with

$HADOOP_HOME/bin/hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath>

and

$HADOOP_HOME/bin/hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath>

but this is not working. Please help.

Thanks and Regards
Utkarsh Gupta

