hadoop-mapreduce-user mailing list archives

From Utkarsh Gupta <Utkarsh_Gu...@infosys.com>
Subject RE: Including third party jar files in Map Reduce job
Date Wed, 04 Apr 2012 13:13:09 GMT
Hi Harsh,

This worked; it was exactly what I was looking for.
The warning is gone, and I can now add third-party jar files using the DistributedCache.addFileToClassPath()
method.
There is no longer any need to copy the jar to each node's $HADOOP_HOME/lib folder.

Thanks a lot
Utkarsh

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com] 
Sent: Wednesday, April 04, 2012 6:32 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

When using Tool, do not use:

Configuration conf = new Configuration();

Instead get config from the class:

Configuration conf = getConf();

This is documented at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Tool.html
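
As background on why the warning persists with a fresh Configuration: ToolRunner parses the generic options (such as -libjars) into the Configuration it is given and passes that same object to the Tool, so a new Configuration created inside run() never sees them. The following is a stdlib-only Java model of that hand-off, not Hadoop code. MiniConf, MiniTool, runTool, submit, and both property keys are hypothetical stand-ins invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stdlib-only model of the ToolRunner -> Tool hand-off. None of these types are Hadoop classes. */
public class ToolConfSketch {
    // Hypothetical property keys, chosen for this sketch only.
    static final String LIBJARS_KEY = "sketch.libjars";
    static final String PARSED_KEY = "sketch.generic.options.parsed";

    /** Stand-in for a Hadoop Configuration: a plain string map. */
    public static class MiniConf {
        final Map<String, String> props = new HashMap<>();
    }

    /** Stand-in for a Tool: the runner injects the configuration before run(). */
    public abstract static class MiniTool {
        private MiniConf conf;
        public void setConf(MiniConf conf) { this.conf = conf; }
        public MiniConf getConf() { return conf; }
        public abstract String run(String[] remainingArgs);
    }

    /** Stand-in for ToolRunner.run: moves generic options into conf, then calls the tool. */
    public static String runTool(MiniConf conf, MiniTool tool, String[] args) {
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                conf.props.put(LIBJARS_KEY, args[++i]); // consume the option and its value
            } else {
                rest.add(args[i]);
            }
        }
        conf.props.put(PARSED_KEY, "true");
        tool.setConf(conf);
        return tool.run(rest.toArray(new String[0]));
    }

    /** Stand-in for job submission: reports what the job can see in its conf. */
    public static String submit(MiniConf conf) {
        String status = conf.props.containsKey(PARSED_KEY) ? "OK" : "WARN";
        return status + " libjars=" + conf.props.getOrDefault(LIBJARS_KEY, "<none>");
    }

    /** Builds a fresh conf inside run(), so the parsed options are lost. */
    public static class BrokenTool extends MiniTool {
        @Override public String run(String[] remainingArgs) { return submit(new MiniConf()); }
    }

    /** Reuses the conf that the runner injected. */
    public static class FixedTool extends MiniTool {
        @Override public String run(String[] remainingArgs) { return submit(getConf()); }
    }

    public static void main(String[] unused) {
        String[] args = {"-libjars", "/home/some/dir/somejar.jar", "in", "out"};
        System.out.println(runTool(new MiniConf(), new BrokenTool(), args)); // WARN libjars=<none>
        System.out.println(runTool(new MiniConf(), new FixedTool(), args));  // OK libjars=/home/some/dir/somejar.jar
    }
}
```

In a real driver the corresponding change is the one-line swap above: getConf() in place of new Configuration() inside run().
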

On Wed, Apr 4, 2012 at 6:25 PM, Utkarsh Gupta <Utkarsh_Gupta@infosys.com> wrote:
> Hi Harsh,
> I have implemented Tool like this
>
> public static void main(String[] args) throws Exception {
>        Configuration configuration = new Configuration();
>        int rc = ToolRunner.run(configuration, new WordCount(), args);
>        System.exit(rc);
>    }
>
>    @Override
>    public int run(String[] args) throws Exception {
>        if (args.length < 2) {
>            System.err.println("Usage: WordCount <input path> <output path>");
>            return -1;
>        }
>        Configuration conf = new Configuration();
>        //conf.set("mapred.job.tracker", "local");
>        Job job = new Job(conf, "wordcount");
>
>        job.setJarByClass(WordCount.class);
>        job.setMapperClass(MyMapper.class);
>        job.setReducerClass(MyReducer.class);
>        job.setOutputKeyClass(Text.class);
>        job.setOutputValueClass(IntWritable.class);
>        job.setInputFormatClass(TextInputFormat.class);
>        job.setOutputFormatClass(TextOutputFormat.class);
>        job.setNumReduceTasks(1);
>        FileInputFormat.addInputPath(job, new Path(args[0]));
>        FileOutputFormat.setOutputPath(job, new Path(args[1]));
>        return (job.waitForCompletion(true)) ? 0 : 1;
>    }
>
> This is working, but I am unable to figure out why it still prints the warning.
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Wednesday, April 04, 2012 6:20 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Including third party jar files in Map Reduce job
>
> Utkarsh,
>
> A log like "12/04/04 15:21:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing
> the arguments. Applications should implement Tool for the same." indicates you haven't implemented
> the Tool approach properly (or aren't calling its run()).
>
> On Wed, Apr 4, 2012 at 5:25 PM, Utkarsh Gupta <Utkarsh_Gupta@infosys.com> wrote:
>> Hi Devaraj,
>>
>> The code is running now after copying the jar to each node.
>> I must have made a mistake previously.
>> Thanks Devaraj and Bejoy :)
>>
>>
>> -----Original Message-----
>> From: Devaraj k [mailto:devaraj.k@huawei.com]
>> Sent: Wednesday, April 04, 2012 2:08 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> As Bejoy mentioned,
>>
>> If you have copied the jar to $HADOOP_HOME/lib, then you should copy it
>> to all the nodes in the cluster. (or)
>>
>> If you want to make use of the -libjars option, your application should implement Tool
>> to support generic options. Please check the link below for more details.
>>
>> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Bejoy Ks [bejoy.hadoop@gmail.com]
>> Sent: Wednesday, April 04, 2012 1:06 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Re: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh
>>         You can add third-party jars to your MapReduce job elegantly
>> in the following ways:
>>
>> 1) Use -libjars:
>> hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....
>>
>> 2) Include the third-party jars in the /lib folder while packaging your
>> application.
>>
>> 3) If you are adding the jar to HADOOP_HOME/lib, you need to add it on all nodes.
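
For option 2, one way to confirm that a packaged job jar really carries its dependencies is to list the entries under lib/. This is a minimal sketch using only java.util.jar. JobJarLibCheck and bundledLibJars are hypothetical names, and the assumption that Hadoop places lib/*.jar from the job jar on the task classpath should be checked against the Hadoop version in use.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Lists dependency jars nested under lib/ inside a job jar (hypothetical helper). */
public class JobJarLibCheck {

    /** Returns the sorted names of entries that look like lib/<something>.jar. */
    public static List<String> bundledLibJars(Path jobJar) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jobJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    found.add(name);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JobJarLibCheck <path to job jar>
        for (String name : bundledLibJars(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

A manifest Class-Path entry pointing at lib/commons-math3.jar does not by itself place that jar inside the archive, so an empty result here would suggest the dependency was not bundled.
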
>>
>> Regards
>> Bejoy KS
>>
>> On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Utkarsh_Gupta@infosys.com> wrote:
>> Hi Devaraj,
>>
>> I have already copied the required jar file to the $HADOOP_HOME/lib folder.
>> Can you tell me where to add the generic option -libjars?
>>
>> The stack trace is:
>> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
>> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
>> 12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
>> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
>> Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>       at wordcount.MyMapper.map(MyMapper.java:22)
>>       at wordcount.MyMapper.map(MyMapper.java:14)
>>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
>>
>> Thanks and Regards
>> Utkarsh
>>
>> -----Original Message-----
>> From: Devaraj k [mailto:devaraj.k@huawei.com]
>> Sent: Wednesday, April 04, 2012 12:35 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh,
>>
>> The usage of the jar command is like this,
>>
>> Usage: hadoop jar <jar> [mainClass] args...
>>
>> If you want the commons-math3.jar to be available to all the tasks, you can do either
>> of these:
>> 1. Copy the jar file into the $HADOOP_HOME/lib dir, or
>> 2. Use the generic option -libjars.
>>
>> Can you give the stack trace of your problem, showing which class it is giving the
>> ClassNotFoundException for (i.e. the main class or the math lib class)?
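
One way to answer this kind of question on a given JVM is to ask directly whether a class resolves and, if so, from where. This is a minimal sketch that assumes it runs with the same classpath as the failing code. ClasspathProbe and locate are hypothetical names, not part of Hadoop.

```java
import java.security.CodeSource;

/** Reports whether a class can be resolved on the current classpath (hypothetical helper). */
public class ClasspathProbe {

    /** Returns "found ..." with the source location when known, or "missing: <name>". */
    public static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return src == null ? "found (JDK runtime)" : "found in " + src.getLocation();
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing: " + className;
        }
    }

    public static void main(String[] args) {
        // Usage: java ClasspathProbe org.apache.commons.math3.random.RandomDataImpl
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running the probe from the driver only checks the client-side classpath. When the exception is thrown inside a map task, the check is more informative if called from task code, for example from the mapper.
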
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com]
>> Sent: Wednesday, April 04, 2012 12:22 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Including third party jar files in Map Reduce job
>>
>> Hi All,
>>
>> I am new to Hadoop and was trying to generate random numbers using the Apache Commons
>> Math library.
>> I used NetBeans to build the jar file, and the manifest has the path to the commons-math jar
>> as lib/commons-math3.jar. I have placed this jar file in the HADOOP_HOME/lib folder, but I
>> am still getting a ClassNotFoundException.
>> I tried using the -libjars option with
>> $HADOOP_HOME/bin/hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath>
>> and
>> $HADOOP_HOME/bin/hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath>
>> but this is not working. Please help.
>>
>>
>> Thanks and Regards
>> Utkarsh Gupta
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Harsh J



--
Harsh J
