crunch-user mailing list archives

From Gauthier AMBARD <gauthier.amb...@gmail.com>
Subject Re: CrunchRuntimeException: java.io.IOException
Date Tue, 24 Jul 2012 15:48:44 GMT
Yep,
http://apache.mirrors.multidist.eu/hadoop/common/stable/hadoop-1.0.3-bin.tar.gz
and
hadoop version says :
Hadoop 1.0.3
Subversion
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
1335192
Compiled by hortonfo on Tue May  8 20:31:25 UTC 2012
From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

Maybe it has to do with some configuration?

Gauthier

2012/7/24 Josh Wills <jwills@cloudera.com>

> Hey Gauthier,
>
> IIRC, that error occurs when the Hadoop version doesn't support multiple
> output files, which Crunch relies on. My understanding was that this was
> part of 1.0.3, viz.
>
>
> http://hadoop.apache.org/common/docs/r1.0.3/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
>
> so I'm a bit thrown-- this is the Apache distro of 1.0.3, right? Not a
> custom Hadoop build?
>
> J
>
> On Tue, Jul 24, 2012 at 8:29 AM, Gauthier AMBARD <
> gauthier.ambard@gmail.com> wrote:
>
>> Hi guys,
>>
>> I wanted to use Crunch, but when I tried the examples I got:
>> org.apache.crunch.impl.mr.run.CrunchRuntimeException:
>> java.io.IOException: File already
>> exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
>>
>> I am running a git (Apache incubator) build of Crunch (07/24/2012)
>> against Hadoop 1.0.3 (maybe this is causing the error, since every
>> dependency is on Hadoop 0.20.x). Or maybe I have messed up my
>> Hadoop configuration (but I can run any Hadoop example).
>>
>> Regards
>> Gauthier
>>
>> Stack trace :
>>
>> 714  [Thread-15] INFO  org.apache.crunch.impl.mr.run.RTNode  - Crunch exception in 'Text(out)' for input: [(http://www.apache.org/).,1]
>> org.apache.crunch.impl.mr.run.CrunchRuntimeException: java.io.IOException: File already exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
>>     at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:44)
>>     at org.apache.crunch.MapFn.process(MapFn.java:34)
>>     at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
>>     at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
>>     at org.apache.crunch.MapFn.process(MapFn.java:34)
>>     at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
>>     at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
>>     at org.apache.crunch.CombineFn$AggregatorCombineFn.process(CombineFn.java:87)
>>     at org.apache.crunch.CombineFn$AggregatorCombineFn.process(CombineFn.java:72)
>>     at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
>>     at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:43)
>>     at org.apache.crunch.MapFn.process(MapFn.java:34)
>>     at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:85)
>>     at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:100)
>>     at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:61)
>>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>> Caused by: java.io.IOException: File already exists:file:/tmp/crunch-1094145699/p1/output/_temporary/_attempt_local_0001_r_000000_0/part-r-00000
>>     at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:228)
>>     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
>>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
>>     at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:128)
>>     at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.getRecordWriter(CrunchMultipleOutputs.java:416)
>>     at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.write(CrunchMultipleOutputs.java:378)
>>     at org.apache.crunch.hadoop.mapreduce.lib.output.CrunchMultipleOutputs.write(CrunchMultipleOutputs.java:356)
>>     at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:42)
>>
>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>
>
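
Since the thread raises both a possible Hadoop version mismatch (Crunch dependencies on 0.20.x, runtime on 1.0.3) and the question of whether multiple-output support is present, one way to narrow it down might be to check which jar a given class is actually loaded from. The following is only a minimal sketch using plain JDK reflection: the `ClasspathCheck` class and its `locate` helper are hypothetical names introduced here, not part of Crunch or Hadoop, and the default class name is the one from the Javadoc URL quoted above.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    // Returns where a class was loaded from (jar or directory URL),
    // "bootstrap" when the JVM reports no code source (e.g. core JDK classes),
    // or null when the class cannot be found or linked on the classpath.
    static String locate(String className) {
        try {
            // 'false' avoids running the class's static initializers.
            Class<?> cls = Class.forName(
                    className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "bootstrap";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumption: the class of interest is the one linked in the thread.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapred.lib.MultipleOutputs";
        String where = locate(name);
        System.out.println(name + " -> " + (where == null ? "not found" : where));
    }
}
```

Run with the same classpath as the failing job; if the reported jar is not the Hadoop 1.0.3 one, a mixed-version classpath would be worth ruling out first.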
