crunch-user mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: writeTextFile gives back exception
Date Thu, 21 Jun 2012 14:31:49 GMT
Yes, they really should. I'll fix the MemPipeline one to be able to
correctly write output to directories.
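
For illustration only, here is a minimal sketch of what directory-style text output could look like, using plain java.nio rather than the Hadoop FileSystem API. It is not the actual MemPipeline code or the eventual fix: the class name DirectoryTextWriter, the method writeText, and the part-file name "part-00000" are all hypothetical and chosen just for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class DirectoryTextWriter {

    // Hypothetical part-file name, picked to resemble MapReduce-style output.
    // This is an assumption for the sketch, not Crunch's actual naming scheme.
    static final String PART_FILE = "part-00000";

    // Treats the target as a directory (as the MRPipeline path does) and
    // writes the lines into a single part file inside it.
    static Path writeText(Path targetDir, List<String> lines) throws IOException {
        if (Files.isRegularFile(targetDir)) {
            throw new IOException("Target must be a directory, not a file: " + targetDir);
        }
        // No-op if the directory already exists; creates it otherwise.
        Files.createDirectories(targetDir);
        Path out = targetDir.resolve(PART_FILE);
        return Files.write(out, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crunchOut");
        Path written = writeText(dir, Arrays.asList("a\t1", "b\t2"));
        System.out.println("Wrote output to " + written);
    }
}
```

With this shape, passing an existing directory such as /home/rahul/crunchOut would no longer fail with "Is a directory", because the writer opens a file inside the directory instead of the directory itself.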

On Thu, Jun 21, 2012 at 3:23 AM, Rahul Sharma <rahul0208@gmail.com> wrote:
> Hi Everyone,
>
> I believe Pipeline types are not completely interchangeable. I wrote
> test cases for MRPipeline and then changed the type to MemPipeline.
> Everything went fine, but while creating the output file using
> writeTextFile, it gave an error with the following stack trace:
>
> 1    [main] ERROR com.cloudera.crunch.impl.mem.MemPipeline  - Exception writing target: Text(/home/rahul/crunchOut)
> java.io.FileNotFoundException: /home/rahul/crunchOut (Is a directory)
>        at java.io.FileOutputStream.open(Native Method)
>        at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:189)
>        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:185)
>        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:256)
>        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:237)
>        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:336)
>        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:382)
>        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:365)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
>        at com.cloudera.crunch.impl.mem.MemPipeline.write(MemPipeline.java:148)
>        at com.cloudera.crunch.impl.mem.MemPipeline.writeTextFile(MemPipeline.java:178)
>
>
> When I looked into it, the code in MemPipeline's writeTextFile
> expects a file path, while I was passing a directory, which is what
> MRPipeline requires. If I pass a file location instead, MemPipeline
> works but MRPipeline breaks with the following exception:
>
> 1 job failure(s) occurred:
> com.mylearning.crunch.FirstTest: SeqFile(/tmp/crunch1711673673/p1)+top1map+GBK+combine+top1reduce+asText+Text(/home/rahul/crunchOut/sample.txt)(class com.mylearning.crunch.FirstTest0):
> java.io.IOException: Mkdirs failed to create /home/rahul/crunchOut/sample.txt
>        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:253)
>        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:237)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
>        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:223)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>        at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:287)
>        at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:429)
>
> Internally, getDestFile(Path src, Path dir, int index) in the
> crunchJob class expects the path to be a directory, not a file.
>
> Shouldn't the two implementations of writeTextFile be in sync?
>
> regards
> Rahul



-- 
Director of Data Science
Cloudera
Twitter: @josh_wills
