incubator-crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-88) Multiple parallelDos on a PGroupedTableImpl does not work
Date Thu, 04 Oct 2012 21:05:47 GMT


Gabriel Reid updated CRUNCH-88:

    Attachment: CRUNCH-88.patch

Attached a patch that resolves the issue, including an integration test that reproduces it.

I had hoped to correct this in the planner, but that is non-trivial: dependencies between
PCollectionImpls are set up while the pipeline is being built, and it appears that fixing
this in the planner would require reworking the structure of the pipeline graph quite a bit.
However, I think this would be a good candidate to revisit once we start working on fusion
optimizations in the planner.

[~jwills] can you take a look at this? If you see a quick way to correct this in the planner
instead, that would be better, but I couldn't spot one.

> Multiple parallelDos on a PGroupedTableImpl does not work
> ---------------------------------------------------------
>                 Key: CRUNCH-88
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-88.patch
>
> Creating multiple distinct PCollections based on a single PGroupedTableImpl does not
> work correctly - the content of the PGroupedTableImpl will only be sent to a single outgoing
> PCollection, and all other PCollections that stem from the grouped table will not receive
> any data.
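
The behaviour one would expect from the description above can be illustrated without Crunch. What follows is a minimal plain-JDK sketch, not Crunch code: `GroupedFanOut`, `groupByKey`, `derive`, `sum`, and `count` are hypothetical names standing in for a grouped table and for two independent operations applied to it. It only models the expected semantics, namely that every collection derived from the same grouped input sees all of the data. It makes no claim about how the planner or the attached patch achieves this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Toy model only: plain Java collections stand in for Crunch's grouped table.
public class GroupedFanOut {

    // Group (key, value) pairs by key; TreeMap keeps the key order deterministic.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Derive a new keyed collection by applying fn to each group's values.
    // The grouped input is only read, so it can be reused by any number of callers.
    static <T> Map<String, T> derive(Map<String, List<Integer>> grouped,
                                     Function<List<Integer>, T> fn) {
        Map<String, T> out = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            out.put(group.getKey(), fn.apply(group.getValue()));
        }
        return out;
    }

    // First hypothetical operation: sum of the values in a group.
    static Integer sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Second hypothetical operation: number of values in a group.
    static Integer count(List<Integer> values) {
        return values.size();
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = groupByKey(
                List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 5)));

        // Two distinct outputs derived from the same grouped input.
        Map<String, Integer> sums = derive(grouped, GroupedFanOut::sum);
        Map<String, Integer> counts = derive(grouped, GroupedFanOut::count);

        // Expected semantics: both outputs are populated, not just the first one.
        System.out.println(sums);
        System.out.println(counts);
    }
}
```

In terms of this sketch, the reported symptom would correspond to `sums` being populated while `counts` comes back empty.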

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
