crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-264) Writing to TextFileTarget map side does not show up in plan
Date Sat, 07 Sep 2013 03:24:51 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Micah Whitacre updated CRUNCH-264:
----------------------------------

    Attachment: CRUNCH-264.png
                CRUNCH-264.txt

Here's the test I wrote to simulate the scenario we are seeing.

Also, here is the rendered version of the DOT.
                
> Writing to TextFileTarget map side does not show up in plan
> -----------------------------------------------------------
>
>                 Key: CRUNCH-264
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-264
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-264.png, CRUNCH-264.txt
>
>
> Creating a pipeline that writes data to a TextFile (map side) and then to Avro
> (reduce side) causes the text-side write, and any processing that happens on
> that branch, to not show up in the plan.
> Specifically, the name of the pipeline is:
> Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]+GBK+ungroup+PTables.values+Avro(/some/test/path)
> However the generated DOT is:
> digraph G {
>   "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
>   "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
>   subgraph "cluster-job1" {
>     subgraph "cluster-job1-map" {
>       label = Map; color = blue;
>       "S3@2118275672@1822883541" [label="S3" shape=box];
>       "S0@875319338@1822883541" [label="S0" shape=box];
>     }
>     subgraph "cluster-job1-reduce" {
>       label = Reduce; color = red;
>       "GBK@221482301@1822883541" [label="GBK" shape=box];
>       "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
>       "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
>     }
>   }
>   "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
>   "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
>   "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
>   "Text(/simple.txt)" -> "S0@875319338@1822883541";
>   "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
>   "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
> }
> The DOT is missing "S1" and the write to '/some/test/first'.
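
The mismatch between the pipeline name and the DOT can be checked mechanically.
The following is a minimal sketch in Python, not part of the original report or
of Crunch itself. It assumes that every stage in the pipeline name is either a
bare identifier (S0, GBK, PTables.values) or an identifier followed by a
parenthesized path (Text(/simple.txt)), and that every plotted node carries a
quoted label="..." attribute. The helper names stages_in_name and
labels_in_dot are hypothetical.

```python
import re

# Pipeline name and DOT output copied from the issue description.
PIPELINE_NAME = (
    "Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]"
    "+GBK+ungroup+PTables.values+Avro(/some/test/path)"
)

DOT = '''
digraph G {
  "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
  "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
  subgraph "cluster-job1" {
    subgraph "cluster-job1-map" {
      label = Map; color = blue;
      "S3@2118275672@1822883541" [label="S3" shape=box];
      "S0@875319338@1822883541" [label="S0" shape=box];
    }
    subgraph "cluster-job1-reduce" {
      label = Reduce; color = red;
      "GBK@221482301@1822883541" [label="GBK" shape=box];
      "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
      "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
    }
  }
}
'''


def stages_in_name(name):
    """Return the stage tokens of a pipeline name, in order of appearance.

    Assumed grammar (not taken from Crunch): an identifier that may contain
    dots, optionally followed by one parenthesized argument such as a path.
    """
    return re.findall(r"[A-Za-z][\w.]*(?:\([^)]*\))?", name)


def labels_in_dot(dot):
    """Return the set of quoted node labels declared in a DOT document.

    Unquoted subgraph labels such as 'label = Map' are deliberately ignored.
    """
    return set(re.findall(r'\[label="([^"]*)"', dot))


# Stages named in the pipeline but absent from the rendered plan.
labels = labels_in_dot(DOT)
missing = [stage for stage in stages_in_name(PIPELINE_NAME) if stage not in labels]
print(missing)
```

Run against the name and DOT quoted above, the check reports the same two
stages the description calls out: S1 and the Text(/some/test/first) target.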

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
