crunch-user mailing list archives

From Quentin Ambard <quentin.amb...@gmail.com>
Subject Re: mr.MRPipeline error running job : java.io.IOException: No such file or directory
Date Wed, 15 May 2013 15:02:23 GMT
Well, I finally figured it out.
My jar was owned by a different Unix user than the one launching the job,
which is why I got the error!
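
In case it helps anyone who hits the same trace, here is a rough sketch of the two checks I would run first on the machine that launches the job. It uses only the JDK, and the class and method names (JarOwnerCheck, ownedByCurrentUser, canCreateTempFile) are made up for illustration; they are not part of Crunch or Hadoop. The first check compares the owner of the jar with the user the JVM runs as, and the second repeats the File.createTempFile call that appears at the bottom of the stack trace. The default jar name is simply the one from my command line; pass your own path as the first argument.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.UserPrincipal;

public class JarOwnerCheck {

    // The user this JVM runs as, taken from the owner of a freshly created temp file.
    public static UserPrincipal currentUser() throws IOException {
        Path probe = Files.createTempFile("owner-probe", ".tmp");
        try {
            return Files.getOwner(probe);
        } finally {
            Files.delete(probe);
        }
    }

    // True if the given file is owned by the user running this JVM.
    public static boolean ownedByCurrentUser(Path file) throws IOException {
        return Files.getOwner(file).equals(currentUser());
    }

    // Repeats the call that fails in the trace: File.createTempFile in a given directory.
    public static boolean canCreateTempFile(File dir) {
        try {
            File f = File.createTempFile("probe-", "", dir);
            f.delete();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass the real jar path as the first argument.
        Path jar = Paths.get(args.length > 0 ? args[0] : "uber-crunch-1.0-SNAPSHOT.jar");
        if (!Files.exists(jar)) {
            System.out.println("Jar not found: " + jar.toAbsolutePath());
            return;
        }
        System.out.println("Jar owner:    " + Files.getOwner(jar).getName());
        System.out.println("Running as:   " + currentUser().getName());
        System.out.println("Owners match: " + ownedByCurrentUser(jar));
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("Can create temp file in " + tmp + ": " + canCreateTempFile(tmp));
    }
}
```

If the owners do not match, changing the owner of the jar or launching the job as the jar's owner is what fixed it for me.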


2013/5/15 Quentin Ambard <quentin.ambard@gmail.com>

> Hi,
> Thanks for your answers.
> - The crunch tmp dir permissions are fine; crunch creates a new folder inside
> it every time I launch the batch
> - crunch example jar (wordcount)
> - The HBase connection is OK (I can scan the table)
> - I updated to crunch 0.6.0, logs are enabled, and I now have more
> information about the error.
> It looks like it can't find some HBase dependency jars to ship to
> the cluster. I think all necessary dependencies are packaged inside my jar.
> All Hadoop dependencies are from the Cloudera repository (so I don't think
> it's a version issue).
>
> Any ideas ?
>
> Exception in thread "main" org.apache.crunch.CrunchRuntimeException:
> java.io.IOException: java.lang.RuntimeException: java.io.IOException: No
> such file or directory
>  at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:153)
> at org.apache.crunch.impl.mr.MRPipeline.runAsync(MRPipeline.java:172)
>  at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:160)
> at org.apache.crunch.impl.mr.MRPipeline.done(MRPipeline.java:181)
>  at
> com.myprocurement.crunch.job.extractor.ExtractAndConcatJob.run(ExtractAndConcatJob.java:102)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at
> com.myprocurement.crunch.job.fullpage.CrunchLauncher.launch(CrunchLauncher.java:40)
> at com.myprocurement.crunch.BatchMain.main(BatchMain.java:31)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>  at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
>  at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: java.io.IOException: java.lang.RuntimeException:
> java.io.IOException: No such file or directory
>  at
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:521)
> at
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:472)
>  at
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:438)
> at
> org.apache.crunch.io.hbase.HBaseSourceTarget.configureSource(HBaseSourceTarget.java:100)
>  at
> org.apache.crunch.impl.mr.plan.JobPrototype.build(JobPrototype.java:192)
> at
> org.apache.crunch.impl.mr.plan.JobPrototype.getCrunchJob(JobPrototype.java:123)
>  at org.apache.crunch.impl.mr.plan.MSCRPlanner.plan(MSCRPlanner.java:159)
> at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:151)
>  ... 12 more
> Caused by: java.lang.RuntimeException: java.io.IOException: No such file
> or directory
> at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:164)
>  at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:518)
>  ... 19 more
> Caused by: java.io.IOException: No such file or directory
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
>  at java.io.File.checkAndCreate(File.java:1705)
> at java.io.File.createTempFile0(File.java:1726)
> at java.io.File.createTempFile(File.java:1803)
>  at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:156)
> ... 23 more
>
>
>
>
>
>
> 2013/5/13 Josh Wills <jwills@cloudera.com>
>
>> It does sound like a permission issue-- you can set the crunch.tmp.dir
>> property on the commandline (assuming you're implementing the Tool
>> interface) by setting -Dcrunch.tmp.dir=... to see if that helps.
>>
>>
>> On Mon, May 13, 2013 at 5:15 AM, Christian Tzolov <tzolov@apache.org> wrote:
>>
>>> You can try MRPipeline.enableDebug() to lower the log level.
>>>
>>>
>>> On Mon, May 13, 2013 at 12:06 PM, Quentin Ambard <
>>> quentin.ambard@gmail.com> wrote:
>>>
>>>> The problem is that I don't see my job on the JobTracker page. It's
>>>> as if the job doesn't even start!
>>>> Is there a way to raise the log level to get more information about the
>>>> error?
>>>>
>>>>
>>>> 2013/5/12 Josh Wills <jwills@cloudera.com>
>>>>
>>>>> Something probably failed in the MapReduce job itself, which meant
>>>>> that there weren't any outputs for Crunch to move around. What do the
>>>>> error logs for the individual tasks look like on the JobTracker status
>>>>> page(s)?
>>>>>
>>>>>
>>>>> On Sat, May 11, 2013 at 5:02 PM, Quentin Ambard <
>>>>> quentin.ambard@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm running a simple job based on crunch on Hadoop CDH 4.1.2.
>>>>>> The job is quite simple: it scans an HBase table, extracts some data
>>>>>> from each entry, groups the results by key and combines them using an
>>>>>> aggregator, then writes the output to another HBase table.
>>>>>> It works fine on my computer; however, when I try to launch it on my
>>>>>> Hadoop cluster I get the following:
>>>>>>
>>>>>> >>hadoop jar uber-crunch-1.0-SNAPSHOT.jar description
>>>>>> /home/quentin/default.properties
>>>>>> 13/05/12 01:57:50 INFO support.ClassPathXmlApplicationContext:
>>>>>> Refreshing
>>>>>> org.springframework.context.support.ClassPathXmlApplicationContext@1f4384c2:
>>>>>> startup date [Sun May 12 01:57:50 CEST 2013]; root of context hierarchy
>>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean
>>>>>> definitions from class path resource [context/job-description-context.xml]
>>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean
>>>>>> definitions from class path resource [context/default-batch-context.xml]
>>>>>> 13/05/12 01:57:51 INFO annotation.ClassPathBeanDefinitionScanner:
>>>>>> JSR-330 'javax.inject.Named' annotation found and supported for component
>>>>>> scanning
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>>> properties file from URL
>>>>>> [file:/tmp/hadoop-hdfs/hadoop-unjar7637839123250781784/default.properties]
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>>> properties file from URL
>>>>>> [jar:file:/home/quentin/uber-crunch-1.0-SNAPSHOT.jar!/default.properties]
>>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>>> properties file from URL [file:/home/quentin/default.properties]
>>>>>> 13/05/12 01:57:51 INFO
>>>>>> annotation.AutowiredAnnotationBeanPostProcessor: JSR-330
>>>>>> 'javax.inject.Inject' annotation found and supported for autowiring
>>>>>> 13/05/12 01:57:51 INFO support.DefaultListableBeanFactory:
>>>>>> Pre-instantiating singletons in
>>>>>> org.springframework.beans.factory.support.DefaultListableBeanFactory@5b7b0998:
>>>>>> defining beans
>>>>>> [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,applicationContextHolder,descriptionLauncher,descriptionExtractor,emailExtractor,rawTextExtractor,keywordsExtractor,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,org.springframework.context.annotation.ConfigurationClassPostProcessor$ImportAwareBeanPostProcessor#0];
>>>>>> root of factory hierarchy
>>>>>> 13/05/12 01:57:52 INFO hbase.HBaseTarget: HBaseTarget ignores checks
>>>>>> for existing outputs...
>>>>>> 13/05/12 01:57:53 INFO collect.PGroupedTableImpl: Setting num reduce
>>>>>> tasks to 2
>>>>>> 13/05/12 01:57:53 ERROR mr.MRPipeline:
>>>>>> org.apache.crunch.CrunchRuntimeException: java.io.IOException:
>>>>>> java.lang.RuntimeException: java.io.IOException: No such file or
>>>>>> directory
>>>>>> 13/05/12 01:57:53 WARN mr.MRPipeline: Not running cleanup while
>>>>>> output targets remain
>>>>>>
>>>>>> Any idea where the problem comes from? Maybe it's something to do with
>>>>>> permissions or a crunch tmp file, but I can't find out where it comes
>>>>>> from.
>>>>>>
>>>>>> Thanks for your help
>>>>>>
>>>>>>
>>>>>> Quentin
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Director of Data Science
>>>>> Cloudera <http://www.cloudera.com>
>>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Quentin Ambard
>>>>
>>>
>>>
>>
>>
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>
>
>
>
> --
> Quentin Ambard
>



-- 
Quentin Ambard
