crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-264) Writing to TextFileTarget map side does not show up in plan
Date Sat, 07 Sep 2013 22:46:51 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761139#comment-13761139 ]

Micah Whitacre commented on CRUNCH-264:
---------------------------------------

The test I wrote is a little janky, but it was mostly set up to mirror the DAG of the real code.
In the real code, the text file is written correctly.

Also, the text file write shows up in the job name as "Text(/some/test/first)"; however, it
does not show up in the DOT diagram. Although I did not call it out explicitly, the "S1"
processing step is also missing from the diagram (but I believe that is an internal detail
of writing to a text file).
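
The mismatch between the job name and the DOT diagram can be checked mechanically. The Python sketch below is not part of Crunch. The helper names (stages_in_job_name, labels_in_dot, missing_from_dot) are hypothetical, and the regular expression used to split the job name is only an assumption about its format, based on the single example in this issue. The job name and DOT text are copied from the issue description.

```python
import re

# Job name and DOT output copied from the issue description.
JOB_NAME = (
    "Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]"
    "+GBK+ungroup+PTables.values+Avro(/some/test/path)"
)

DOT = """
digraph G {
  "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
  "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
  subgraph "cluster-job1" {
    subgraph "cluster-job1-map" {
      label = Map; color = blue;
      "S3@2118275672@1822883541" [label="S3" shape=box];
      "S0@875319338@1822883541" [label="S0" shape=box];
    }
    subgraph "cluster-job1-reduce" {
      label = Reduce; color = red;
      "GBK@221482301@1822883541" [label="GBK" shape=box];
      "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
      "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
    }
  }
  "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
  "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
  "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
  "Text(/simple.txt)" -> "S0@875319338@1822883541";
  "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
  "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
}
"""


def stages_in_job_name(job_name):
    # Assumption: a stage is a name made of letters, digits, '_' or '.',
    # optionally followed by a parenthesised path such as "(/simple.txt)".
    return re.findall(r"[A-Za-z][\w.]*(?:\([^)]*\))?", job_name)


def labels_in_dot(dot_text):
    # Collect every quoted label attribute, e.g. label="S0".
    # Unquoted cluster labels such as "label = Map" are not matched.
    return set(re.findall(r'label="([^"]*)"', dot_text))


def missing_from_dot(job_name, dot_text):
    # Stages named in the job name that have no node label in the DOT text.
    labels = labels_in_dot(dot_text)
    return [stage for stage in stages_in_job_name(job_name) if stage not in labels]


print(missing_from_dot(JOB_NAME, DOT))
# prints ['S1', 'Text(/some/test/first)']
```

For the data in this issue, the only stages without a node are "S1" and the text target, which matches the two omissions described above.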
                
> Writing to TextFileTarget map side does not show up in plan
> -----------------------------------------------------------
>
>                 Key: CRUNCH-264
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-264
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-264.png, CRUNCH-264.txt
>
>
> Creating a pipeline that writes out data to a TextFile (map side) and then Avro (reduce
> side) causes the text-side write, and any processing that might happen on that branch, to
> not show up in the plan.
> Specifically, the name of the pipeline is:
> Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]+GBK+ungroup+PTables.values+Avro(/some/test/path)
> However the generated DOT is:
> digraph G {
>   "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
>   "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
>   subgraph "cluster-job1" {
>     subgraph "cluster-job1-map" {
>       label = Map; color = blue;
>       "S3@2118275672@1822883541" [label="S3" shape=box];
>       "S0@875319338@1822883541" [label="S0" shape=box];
>     }
>     subgraph "cluster-job1-reduce" {
>       label = Reduce; color = red;
>       "GBK@221482301@1822883541" [label="GBK" shape=box];
>       "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
>       "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
>     }
>   }
>   "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
>   "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
>   "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
>   "Text(/simple.txt)" -> "S0@875319338@1822883541";
>   "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
>   "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
> }
> This is missing "S1" and the write to '/some/test/first'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
