crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-506) Default To.textFile to use TextFileSourceTarget
Date Wed, 01 Apr 2015 22:48:54 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391662#comment-14391662 ]

Josh Wills commented on CRUNCH-506:
-----------------------------------

+1, seems like the right thing to do.

> Default To.textFile to use TextFileSourceTarget
> -----------------------------------------------
>
>                 Key: CRUNCH-506
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-506
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.11.0
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>
> Had a consumer with an interesting situation.  They had code like the following:
> {code}
> PCollection<String> output = ...
> output.write(To.textFile(path));
> pipeline.done();
> long size = output.length().getValue();
> {code}
> This code was actually failing with an exception like the following:
> {noformat}
> Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /tmp/crunch-107739816/p8
>   org.apache.oozie.action.hadoop.JavaMainException: org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /tmp/crunch-107739816/p8
>   at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:58)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
> {noformat}
> I believe this is because To.textFile(...) uses just TextFileTarget, so the length()
> call goes back to the intermediate state that was cleaned up by the done() call.  Switching
> To.textFile(...) to TextFileSourceTarget instead lets the code succeed.
> It seems we could switch To.textFile(...) to use the SourceTarget impl to make this
> less surprising and confusing to consumers.
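
The explanation in the report can be illustrated with a small, self-contained toy model. This is not Crunch code: the class MaterializeSketch, its methods, and the /tmp/toy-pipeline and /out paths are all hypothetical, and the model only encodes the behavior the reporter believes is happening (a write-only target leaves the temporary output as the only place to read from, and done() removes it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; every name here is hypothetical and none of it is Crunch API.
public class MaterializeSketch {
    // Stand-in for a file system: path -> lines stored at that path.
    private final Map<String, List<String>> files = new HashMap<>();
    // Intermediate outputs that done() is assumed to clean up.
    private final List<String> tempPaths = new ArrayList<>();
    // Where a later length() call would read the data back from.
    private String readBackPath;

    // "Runs the job": data lands in a temp path and in the requested output path.
    public void write(List<String> data, String outputPath, boolean targetIsAlsoSource) {
        String temp = "/tmp/toy-pipeline/p" + tempPaths.size();
        files.put(temp, data);
        files.put(outputPath, data);
        tempPaths.add(temp);
        // Assumption: only a target that is also a source can be read back from;
        // a write-only target leaves the temp output as the only readable copy.
        readBackPath = targetIsAlsoSource ? outputPath : temp;
    }

    // Finishes the pipeline and removes all intermediate outputs.
    public void done() {
        for (String temp : tempPaths) {
            files.remove(temp);
        }
        tempPaths.clear();
    }

    // Counts the records by reading the data back from readBackPath.
    public long length() {
        List<String> lines = files.get(readBackPath);
        if (lines == null) {
            throw new IllegalStateException("No files found to materialize at: " + readBackPath);
        }
        return lines.size();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");

        MaterializeSketch targetOnly = new MaterializeSketch();
        targetOnly.write(data, "/out/target-only", false);
        targetOnly.done();
        try {
            targetOnly.length();
        } catch (IllegalStateException e) {
            System.out.println("target only: " + e.getMessage());
        }

        MaterializeSketch sourceTarget = new MaterializeSketch();
        sourceTarget.write(data, "/out/source-target", true);
        sourceTarget.done();
        System.out.println("source target: length = " + sourceTarget.length());
    }
}
```

In this model the write-only case fails after done() because its only readable copy was the temporary one, while the source-capable case reads back from the final output path, which is the effect the proposed change to To.textFile(...) is meant to have.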



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
