crunch-dev mailing list archives

From "Ioan Marius Curelariu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CRUNCH-390) Planner is not adding dependencies between jobs when planning is done in more than one stage.
Date Wed, 07 May 2014 07:43:05 GMT
Ioan Marius Curelariu created CRUNCH-390:
--------------------------------------------

             Summary: Planner is not adding dependencies between jobs when planning is done in more than one stage.
                 Key: CRUNCH-390
                 URL: https://issues.apache.org/jira/browse/CRUNCH-390
             Project: Crunch
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.8.2
            Reporter: Ioan Marius Curelariu
            Assignee: Josh Wills


The planner splits planning into multiple stages when it finds jobs that depend on
ReadableData. One example of this is the BloomFilterJoinStrategy.
While the generated plan dot file looks correct, the planner does not actually add dependencies
between jobs that are created in different planning stages.
I have a pipeline that reads 3 input sources. It joins 2 of them using a bloom filter join
strategy. Later on, it joins the result with the output of a job that reads the third source path.
If the jobs on the bloom filter branch finish before the job reading the third source, the
executor attempts to start the 4th job, which is supposed to join everything, before the 3rd
one finishes, resulting in an input path not found exception.
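
To illustrate the symptom, here is a minimal, self-contained sketch of an executor that starts any job whose recorded prerequisites have finished. It is not Crunch code: the class JobDependencySketch, its methods, and the job names job1..job4 are all hypothetical, and the assumption that the missing edge is "job4 depends on job3" is inferred only from the behaviour described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a job executor; none of these names are Crunch APIs.
public class JobDependencySketch {

    // Maps each job to the set of jobs that must finish before it may start.
    private final Map<String, Set<String>> prerequisites = new HashMap<>();
    private final Set<String> finished = new HashSet<>();

    public void addJob(String job) {
        prerequisites.putIfAbsent(job, new HashSet<>());
    }

    public void addDependency(String job, String prerequisite) {
        addJob(job);
        addJob(prerequisite);
        prerequisites.get(job).add(prerequisite);
    }

    public void markFinished(String job) {
        finished.add(job);
    }

    // Unfinished jobs whose recorded prerequisites have all finished, sorted by name.
    public List<String> runnableJobs() {
        List<String> runnable = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : prerequisites.entrySet()) {
            boolean notDone = !finished.contains(entry.getKey());
            if (notDone && finished.containsAll(entry.getValue())) {
                runnable.add(entry.getKey());
            }
        }
        Collections.sort(runnable);
        return runnable;
    }

    // Assumed shape of the pipeline from the report:
    //   job1, job2: bloom filter branch over the first two sources
    //   job3:       reads the third source
    //   job4:       final join, needs the output of both job2 and job3
    public static JobDependencySketch build(boolean includeCrossStageEdge) {
        JobDependencySketch plan = new JobDependencySketch();
        plan.addDependency("job2", "job1");
        plan.addDependency("job4", "job2");
        if (includeCrossStageEdge) {
            plan.addDependency("job4", "job3");
        } else {
            plan.addJob("job3"); // job3 exists, but job4 does not wait for it
        }
        return plan;
    }

    public static void main(String[] args) {
        for (boolean edge : new boolean[] {false, true}) {
            JobDependencySketch plan = build(edge);
            // The bloom filter branch finishes first; job3 is still running.
            plan.markFinished("job1");
            plan.markFinished("job2");
            System.out.println("cross-stage edge=" + edge + " -> " + plan.runnableJobs());
        }
    }
}
```

With the edge missing, job4 is reported as runnable while job3 has not finished, which corresponds to the point where the executor would fail on job3's missing output path. With the edge present, only job3 is runnable.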



--
This message was sent by Atlassian JIRA
(v6.2#6252)
