crunch-user mailing list archives

From Quentin Ambard <quentin.amb...@gmail.com>
Subject Re: mr.MRPipeline error running job : java.io.IOException: No such file or directory
Date Wed, 15 May 2013 10:42:18 GMT
Hi,
Thanks for your answers.
- The Crunch tmp dir permissions are fine; Crunch creates a new folder
inside it every time I launch the batch.
- Crunch example jar (wordcount)
- The HBase connection is OK (I can scan the table).
- I updated to Crunch 0.6.0; logging is enabled and I now have more
information about the error.
It looks like it can't find some HBase dependency jars to ship to the
cluster. I think all the necessary dependencies are packaged inside my jar.
All the Hadoop dependencies come from the Cloudera repository (so I don't
think it's a version issue).

Any ideas?

Exception in thread "main" org.apache.crunch.CrunchRuntimeException: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
    at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:153)
    at org.apache.crunch.impl.mr.MRPipeline.runAsync(MRPipeline.java:172)
    at org.apache.crunch.impl.mr.MRPipeline.run(MRPipeline.java:160)
    at org.apache.crunch.impl.mr.MRPipeline.done(MRPipeline.java:181)
    at com.myprocurement.crunch.job.extractor.ExtractAndConcatJob.run(ExtractAndConcatJob.java:102)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.myprocurement.crunch.job.fullpage.CrunchLauncher.launch(CrunchLauncher.java:40)
    at com.myprocurement.crunch.BatchMain.main(BatchMain.java:31)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: java.lang.RuntimeException: java.io.IOException: No such file or directory
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:521)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:472)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:438)
    at org.apache.crunch.io.hbase.HBaseSourceTarget.configureSource(HBaseSourceTarget.java:100)
    at org.apache.crunch.impl.mr.plan.JobPrototype.build(JobPrototype.java:192)
    at org.apache.crunch.impl.mr.plan.JobPrototype.getCrunchJob(JobPrototype.java:123)
    at org.apache.crunch.impl.mr.plan.MSCRPlanner.plan(MSCRPlanner.java:159)
    at org.apache.crunch.impl.mr.MRPipeline.plan(MRPipeline.java:151)
    ... 12 more
Caused by: java.lang.RuntimeException: java.io.IOException: No such file or directory
    at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:164)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:518)
    ... 19 more
Caused by: java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.checkAndCreate(File.java:1705)
    at java.io.File.createTempFile0(File.java:1726)
    at java.io.File.createTempFile(File.java:1803)
    at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:156)
    ... 23 more
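
The innermost frames of the trace show the IOException coming out of
java.io.File.createTempFile, called from JarFinder.getJar. As a rough way to
see that failure in isolation, here is a minimal sketch that calls
File.createTempFile against a directory that does not exist and against one
that does. The class name TempFileRepro and the helper tryCreateTempFile are
made up for illustration, and the directories used here are stand-ins: the
trace does not say which directory JarFinder actually passes in.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical stand-alone reproduction; not part of Crunch, HBase or Hadoop.
public class TempFileRepro {

    /**
     * Tries to create (and then delete) a temp file inside dir.
     * Returns null on success, or the IOException message on failure.
     */
    static String tryCreateTempFile(File dir) {
        try {
            // The "hadoop-" prefix is only illustrative.
            File tmp = File.createTempFile("hadoop-", "", dir);
            tmp.delete();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        File existing = new File(System.getProperty("java.io.tmpdir"));
        // A directory name that should not exist (assumption: nothing created it).
        File missing = new File(existing, "no-such-dir-" + System.nanoTime());

        // Expected to print a non-null error message for the missing directory.
        System.out.println("missing dir  -> " + tryCreateTempFile(missing));
        // Expected to print null if the JVM temp directory is writable.
        System.out.println("existing dir -> " + tryCreateTempFile(existing));
    }
}
```

If the first call fails in the same way as the trace, the thing to check on
the cluster would be whether the directory used for the temporary jar exists
and is writable by the user launching the job.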






2013/5/13 Josh Wills <jwills@cloudera.com>

> It does sound like a permission issue-- you can set the crunch.tmp.dir
> property on the commandline (assuming you're implementing the Tool
> interface) by setting -Dcrunch.tmp.dir=... to see if that helps.
>
>
> On Mon, May 13, 2013 at 5:15 AM, Christian Tzolov <tzolov@apache.org> wrote:
>
>> You can try MRPipeline.enableDebug() to lower the log level.
>>
>>
>> On Mon, May 13, 2013 at 12:06 PM, Quentin Ambard <
>> quentin.ambard@gmail.com> wrote:
>>
>>> The problem is that I don't see my job on the JobTracker page. It's as
>>> if the job doesn't even start!
>>> Is there a way to raise the log level to get more information on the
>>> error?
>>>
>>>
>>> 2013/5/12 Josh Wills <jwills@cloudera.com>
>>>
>>>> Something probably failed in the MapReduce job itself, which meant that
>>>> there weren't any outputs for Crunch to move around. What do the error logs
>>>> for the individual tasks look like on the JobTracker status page(s)?
>>>>
>>>>
>>>> On Sat, May 11, 2013 at 5:02 PM, Quentin Ambard <
>>>> quentin.ambard@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I'm running a simple Crunch-based job on Hadoop CDH 4.1.2.
>>>>> The job is quite simple: it scans an HBase table, extracts some data
>>>>> from each HBase entry, groups the results by key and combines them
>>>>> using an aggregator, then writes them back to another HBase table.
>>>>> It works fine on my computer; however, when I try to launch it on my
>>>>> Hadoop cluster I get the following:
>>>>>
>>>>> >>hadoop jar uber-crunch-1.0-SNAPSHOT.jar description
>>>>> /home/quentin/default.properties
>>>>> 13/05/12 01:57:50 INFO support.ClassPathXmlApplicationContext:
>>>>> Refreshing
>>>>> org.springframework.context.support.ClassPathXmlApplicationContext@1f4384c2:
>>>>> startup date [Sun May 12 01:57:50 CEST 2013]; root of context hierarchy
>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean
>>>>> definitions from class path resource [context/job-description-context.xml]
>>>>> 13/05/12 01:57:50 INFO xml.XmlBeanDefinitionReader: Loading XML bean
>>>>> definitions from class path resource [context/default-batch-context.xml]
>>>>> 13/05/12 01:57:51 INFO annotation.ClassPathBeanDefinitionScanner:
>>>>> JSR-330 'javax.inject.Named' annotation found and supported for component
>>>>> scanning
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>> properties file from URL
>>>>> [file:/tmp/hadoop-hdfs/hadoop-unjar7637839123250781784/default.properties]
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>> properties file from URL
>>>>> [jar:file:/home/quentin/uber-crunch-1.0-SNAPSHOT.jar!/default.properties]
>>>>> 13/05/12 01:57:51 INFO config.PropertyPlaceholderConfigurer: Loading
>>>>> properties file from URL [file:/home/quentin/default.properties]
>>>>> 13/05/12 01:57:51 INFO
>>>>> annotation.AutowiredAnnotationBeanPostProcessor: JSR-330
>>>>> 'javax.inject.Inject' annotation found and supported for autowiring
>>>>> 13/05/12 01:57:51 INFO support.DefaultListableBeanFactory:
>>>>> Pre-instantiating singletons in
>>>>> org.springframework.beans.factory.support.DefaultListableBeanFactory@5b7b0998:
>>>>> defining beans
>>>>> [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,applicationContextHolder,descriptionLauncher,descriptionExtractor,emailExtractor,rawTextExtractor,keywordsExtractor,org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,org.springframework.context.annotation.ConfigurationClassPostProcessor$ImportAwareBeanPostProcessor#0];
>>>>> root of factory hierarchy
>>>>> 13/05/12 01:57:52 INFO hbase.HBaseTarget: HBaseTarget ignores checks
>>>>> for existing outputs...
>>>>> 13/05/12 01:57:53 INFO collect.PGroupedTableImpl: Setting num reduce
>>>>> tasks to 2
>>>>> 13/05/12 01:57:53 ERROR mr.MRPipeline:
>>>>> org.apache.crunch.CrunchRuntimeException: java.io.IOException:
>>>>> java.lang.RuntimeException: java.io.IOException: No such file or directory
>>>>> 13/05/12 01:57:53 WARN mr.MRPipeline: Not running cleanup while output
>>>>> targets remain
>>>>>
>>>>> Any idea of the origin of the problem? Maybe it's something to do
>>>>> with permissions or a Crunch tmp file, but I can't find out where it
>>>>> comes from.
>>>>>
>>>>> Thanks for your help
>>>>>
>>>>>
>>>>> Quentin
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Director of Data Science
>>>> Cloudera <http://www.cloudera.com>
>>>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>>>
>>>
>>>
>>>
>>> --
>>> Quentin Ambard
>>>
>>
>>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>



-- 
Quentin Ambard
