crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-390) Planner is not adding dependencies between jobs when planning is done in more than one stage.
Date Sat, 10 May 2014 22:08:55 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992486#comment-13992486 ]

Josh Wills commented on CRUNCH-390:
-----------------------------------

[~cmarius] Good-looking patch, thank you so much! I'm running it through the integration tests
now and will commit it when they pass.

> Planner is not adding dependencies between jobs when planning is done in more than one stage.
> ---------------------------------------------------------------------------------------------
>
>                 Key: CRUNCH-390
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-390
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.2
>            Reporter: Ioan Marius Curelariu
>            Assignee: Josh Wills
>         Attachments: 0001-Patched-the-MSCRPlanner-to-correctly-add-dependencie.patch
>
>
> The planner does the planning in multiple stages when it finds job dependencies
> on ReadableData. One example of this case is when using the BloomFilterJoinStrategy.
> While the generated plan dot file looks good, the planner does not actually add dependencies
> between jobs that are created in different planning stages.
> I have a pipeline that reads 3 input sources. It joins 2 of them using a bloom filter
> join strategy. Later on, it joins the result with the output of a job reading from the
> third source path.
> If the jobs on the branch using the bloom filter finish before the one reading
> the third source, the executor attempts to start the 4th job, which is supposed to join
> everything, before the 3rd one finishes, resulting in an input path not found exception.
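
The failure mode described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Crunch code and not the attached patch: the `Job` class, the `plan` and `runnable` functions, the path names, and the assignment of jobs to planning stages are all hypothetical, chosen to mirror the four-job pipeline in the report. The model also does not try to reproduce how Crunch handles ReadableData dependencies; it only shows what happens when dependency edges are wired per planning stage instead of across all stages.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a planned MapReduce job."""
    name: str
    inputs: set
    outputs: set
    deps: set = field(default_factory=set)


def build_stages():
    # Assumed layout: two planning stages, four jobs, three sources.
    stage1 = [
        Job("job1", {"src_b"}, {"tmp/bloom"}),                  # builds the bloom filter
        Job("job3", {"src_c"}, {"tmp/c"}),                      # reads the third source
    ]
    stage2 = [
        Job("job2", {"src_a", "src_b", "tmp/bloom"}, {"tmp/ab"}),  # bloom filter join
        Job("job4", {"tmp/ab", "tmp/c"}, {"out"}),                 # joins everything
    ]
    return [stage1, stage2]


def plan(stages, share_across_stages):
    """Add a dependency from each job to the producer of each of its inputs.

    With share_across_stages=False the producer table is reset for every
    stage, so edges to jobs from earlier stages are lost (the reported bug,
    as modeled here).
    """
    producers = {}
    for stage in stages:
        if not share_across_stages:
            producers = {}
        for job in stage:
            for path in job.outputs:
                producers[path] = job.name
        for job in stage:
            for path in job.inputs:
                producer = producers.get(path)
                if producer is not None and producer != job.name:
                    job.deps.add(producer)
    return [job for stage in stages for job in stage]


def runnable(jobs, finished):
    """Names of unfinished jobs whose dependencies have all finished."""
    return sorted(j.name for j in jobs
                  if j.name not in finished and j.deps <= finished)


# The bloom filter branch (job1, job2) has finished; job3 is still pending.
finished = {"job1", "job2"}
print("per-stage wiring:  ", runnable(plan(build_stages(), False), finished))
print("cross-stage wiring:", runnable(plan(build_stages(), True), finished))
```

With per-stage wiring, job4 is offered for execution while job3 has not finished, which matches the missing input path symptom in the report. With the producer table shared across stages, only job3 is runnable at that point.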



--
This message was sent by Atlassian JIRA
(v6.2#6252)
