hadoop-mapreduce-user mailing list archives

From Rahul Singh <smart.rahul.i...@gmail.com>
Subject Re: using "-libjars" in Hadoop 2.2.1
Date Wed, 16 Apr 2014 15:39:22 GMT
Any help? All suggestions are welcome.


On Wed, Apr 16, 2014 at 1:13 PM, Rahul Singh <smart.rahul.iiit@gmail.com> wrote:

> Hi,
> I am running the following command, but the jar is still not available
> to the mappers and reducers.
>
> hadoop jar /home/hduser/workspace/Minerva.jar my.search.Minerva
> /user/hduser/input_minerva_actual /user/hduser/output_merva_actual3
> -libjars /home/hduser/Documents/Lib/json-simple-1.1.1.jar
> -Dmapreduce.user.classpath.first=true
>
>
> Error Log
>
> 14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at /
> 0.0.0.0:8032
> 14/04/16 13:08:37 INFO client.RMProxy: Connecting to ResourceManager at /
> 0.0.0.0:8032
> 14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line option
> parsing not performed. Implement the Tool interface and execute your
> application with ToolRunner to remedy this.
> 14/04/16 13:08:37 INFO mapred.FileInputFormat: Total input paths to
> process : 1
> 14/04/16 13:08:37 INFO mapreduce.JobSubmitter: number of splits:10
> 14/04/16 13:08:37 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_1397534064728_0028
> 14/04/16 13:08:38 INFO impl.YarnClientImpl: Submitted application
> application_1397534064728_0028
> 14/04/16 13:08:38 INFO mapreduce.Job: The url to track the job:
> http://L-Rahul-Tech:8088/proxy/application_1397534064728_0028/
> 14/04/16 13:08:38 INFO mapreduce.Job: Running job: job_1397534064728_0028
> 14/04/16 13:08:47 INFO mapreduce.Job: Job job_1397534064728_0028 running
> in uber mode : false
> 14/04/16 13:08:47 INFO mapreduce.Job:  map 0% reduce 0%
> 14/04/16 13:08:58 INFO mapreduce.Job: Task Id :
> attempt_1397534064728_0028_m_000005_0, Status : FAILED
> Error: java.lang.RuntimeException: Error in configuring object
>     at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
>     at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>     at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:416)
>     at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:622)
>     at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>     ... 9 more
> Caused by: java.lang.NoClassDefFoundError:
> org/json/simple/parser/ParseException
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:270)
>     at
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1821)
>     at
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1786)
>     at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1880)
>     at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1906)
>     at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:1107)
>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>     ... 14 more
> Caused by: java.lang.ClassNotFoundException:
> org.json.simple.parser.ParseException
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
>     ... 22 more
>
> When I analyzed the logs, I found this warning:
> "14/04/16 13:08:37 WARN mapreduce.JobSubmitter: Hadoop command-line option
> parsing not performed. Implement the Tool interface and execute your
> application with ToolRunner to remedy this."
>
> But I have implemented the Tool interface, as shown below:
>
> package my.search;
>
> import org.apache.hadoop.conf.Configured;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapred.FileInputFormat;
> import org.apache.hadoop.mapred.FileOutputFormat;
> import org.apache.hadoop.mapred.JobClient;
> import org.apache.hadoop.mapred.JobConf;
> import org.apache.hadoop.mapred.TextInputFormat;
> import org.apache.hadoop.mapred.TextOutputFormat;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> public class Minerva extends Configured implements Tool
> {
>     public int run(String[] args) throws Exception {
>         JobConf conf = new JobConf(Minerva.class);
>         conf.setJobName("minerva sample job");
>
>         conf.setMapOutputKeyClass(Text.class);
>         conf.setMapOutputValueClass(TextArrayWritable.class);
>
>         conf.setOutputKeyClass(Text.class);
>         conf.setOutputValueClass(Text.class);
>
>         conf.setMapperClass(Map.class);
>         // conf.setCombinerClass(Reduce.class);
>         conf.setReducerClass(Reduce.class);
>
>         conf.setInputFormat(TextInputFormat.class);
>         conf.setOutputFormat(TextOutputFormat.class);
>
>         FileInputFormat.setInputPaths(conf, new Path(args[0]));
>         FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>
>         JobClient.runJob(conf);
>
>         return 0;
>     }
>
>     public static void main(String[] args) throws Exception {
>         int res = ToolRunner.run(new Minerva(), args);
>         System.exit(res);
>     }
> }
>
>
> Please let me know if you see any issues.
>
>
>
> On Thu, Apr 10, 2014 at 9:29 AM, Shengjun Xin <sxin@gopivotal.com> wrote:
>
>> Add '-Dmapreduce.user.classpath.first=true' to your command and try again.
>>
>>
>>
>> On Wed, Apr 9, 2014 at 6:27 AM, Kim Chew <kchew534@gmail.com> wrote:
>>
>>> It seems to me that in Hadoop 2.2.1, the "libjars" option does not look
>>> for jars on the local file system, but on HDFS instead. For example,
>>>
>>> hadoop jar target/myJar.jar Foo -libjars
>>> /home/kchew/test-libs/testJar.jar /user/kchew/inputs/raw.vector
>>> /user/kchew/outputs hdfs://remoteNN:8020 remoteJT:8021
>>>
>>> 14/04/08 15:11:02 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>>> processName=JobTracker, sessionId=
>>> 14/04/08 15:11:02 INFO mapreduce.JobSubmitter: Cleaning up the staging
>>> area
>>> file:/tmp/hadoop-kchew/mapred/staging/kchew202924688/.staging/job_local202924688_0001
>>> 14/04/08 15:11:02 ERROR security.UserGroupInformation:
>>> PriviledgedActionException as:kchew (auth:SIMPLE)
>>> cause:java.io.FileNotFoundException: File does not exist:
>>> hdfs://remoteNN:8020/home/kchew/test-libs/testJar.jar
>>> java.io.FileNotFoundException: File does not exist:
>>> hdfs:/remoteNN:8020/home/kchew/test-libs/testJar.jar
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1110)
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>>>     at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>>>     at
>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>     at
>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>     at
>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
>>>     at
>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>     at
>>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)
>>>
>>> So under Hadoop 2.2.1, do I have to set some configuration explicitly
>>> so that the "libjars" option copies the file from the local file system
>>> to HDFS?
>>>
>>> TIA
>>>
>>> Kim
>>>
>>
>>
>>
>> --
>> Regards
>> Shengjun
>>
>
>
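
A note on the warning quoted above, offered as a likely explanation rather
than a confirmed diagnosis. Two things in the command and the code look
relevant. First, the documented usage is "hadoop jar <jar> <class>
[genericOptions] [args]", and to my knowledge GenericOptionsParser stops at
the first argument that is not an option, so -libjars and -D placed after the
input and output paths are handed to run() untouched. Second, run() builds its
job configuration with new JobConf(Minerva.class), which does not start from
the Configuration that ToolRunner populated; new JobConf(getConf(),
Minerva.class) would. The sketch below is a simplified, self-contained model
of the first point only. It is not Hadoop's code: the class
GenericOptionOrderSketch, the Parsed holder, and parse() are hypothetical
names, and only -libjars and -Dkey=value are modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of "stop at the first non-option" parsing.
// It only illustrates why argument order matters; it is not GenericOptionsParser.
class GenericOptionOrderSketch {

    // Holds the recognized generic options and the arguments left for the tool.
    static final class Parsed {
        final Map<String, String> generic = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            if (arg.equals("-libjars") && i + 1 < args.length) {
                // -libjars takes the next argument as its value.
                parsed.generic.put("libjars", args[i + 1]);
                i += 2;
            } else if (arg.startsWith("-D") && arg.contains("=")) {
                // -Dkey=value sets a single configuration property.
                int eq = arg.indexOf('=');
                parsed.generic.put(arg.substring(2, eq), arg.substring(eq + 1));
                i += 1;
            } else {
                // First non-option argument: stop parsing here.
                break;
            }
        }
        // Everything from the stopping point onward is passed through unchanged.
        parsed.remaining.addAll(Arrays.asList(args).subList(i, args.length));
        return parsed;
    }

    public static void main(String[] args) {
        String in = "/user/hduser/input_minerva_actual";
        String out = "/user/hduser/output_merva_actual3";
        String jar = "/home/hduser/Documents/Lib/json-simple-1.1.1.jar";
        String flag = "-Dmapreduce.user.classpath.first=true";

        // Order used in the thread: paths first, generic options last.
        Parsed optionsLast = parse(new String[] {in, out, "-libjars", jar, flag});
        // Reordered: generic options first, paths last.
        Parsed optionsFirst = parse(new String[] {"-libjars", jar, flag, in, out});

        System.out.println("options last:  " + optionsLast.generic);
        System.out.println("options first: " + optionsFirst.generic);
    }
}
```

If this model matches Hadoop's behaviour, the command from the thread would
presumably need its generic options moved ahead of the paths:

hadoop jar /home/hduser/workspace/Minerva.jar my.search.Minerva
-libjars /home/hduser/Documents/Lib/json-simple-1.1.1.jar
-Dmapreduce.user.classpath.first=true
/user/hduser/input_minerva_actual /user/hduser/output_merva_actual3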
