crunch-dev mailing list archives

From "Jinal Shah (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-361) Illegal State Exception
Date Thu, 27 Feb 2014 21:02:21 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinal Shah updated CRUNCH-361:
------------------------------

    Description: 
I was using ParallelDoOptions to tell the planner to handle a step in a certain way. When
you pass a SourceTarget to it and then do a union or co-group on the resulting PCollection
in a later step, the planner tries to find the size of the parent source, which has not been
generated yet. Here are the steps to reproduce it:

{code}
PCollection<U> collection = afterSomeOperation();
// this could be any SourceTarget implementation
SourceTarget<U> marker = new SourceTarget<U>(pathThatDoesNotExist);
pipeline.write(collection, marker);
PCollection<U> collection2 = pipeline.read(marker);
PCollection<V> collection3 = collection2.parallelDo(DoFn,PType,ParallelDoOptions.builder().sources(marker).build());
doSomeMoreOperation();
PCollection<V> union = collection3.union(SomePCollectionOfV);
{code}

This throws an IllegalStateException because the union cannot find the size of the marker,
which has not been generated yet. The planner should recognize that the source does not exist
yet and that a job in the pipeline will generate it.
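
To make the requested behavior concrete, here is a toy model in plain Java. It is only a sketch under assumptions: ToyPlanner, markPending, strictSize, tolerantSize and the /tmp/marker path are hypothetical names invented for illustration, and none of this is Crunch's planner code. It contrasts a size lookup that fails on a path that has not been written yet with one that falls back to an estimate when a queued job is known to produce that path.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PendingSourceSketch {

    /** Hypothetical planner bookkeeping; not a Crunch class. */
    public static final class ToyPlanner {
        // Sizes (in bytes) of paths that already exist.
        private final Map<String, Long> materializedSizes = new HashMap<>();
        // Paths that a job queued earlier in the pipeline will write.
        private final Set<String> pendingOutputs = new HashSet<>();

        public void markPending(String path) {
            pendingOutputs.add(path);
        }

        public void markMaterialized(String path, long sizeBytes) {
            pendingOutputs.remove(path);
            materializedSizes.put(path, sizeBytes);
        }

        /** Fails whenever the path has not been written yet. */
        public long strictSize(String path) {
            Long size = materializedSizes.get(path);
            if (size == null) {
                throw new IllegalStateException("No size available for " + path);
            }
            return size;
        }

        /** Uses a caller-supplied estimate when a queued job will produce the path. */
        public long tolerantSize(String path, long estimateBytes) {
            Long size = materializedSizes.get(path);
            if (size != null) {
                return size;
            }
            if (pendingOutputs.contains(path)) {
                return estimateBytes;
            }
            throw new IllegalStateException("No size available for " + path);
        }
    }

    public static void main(String[] args) {
        ToyPlanner planner = new ToyPlanner();
        String marker = "/tmp/marker"; // hypothetical path, not yet written
        planner.markPending(marker);

        try {
            planner.strictSize(marker);
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
        System.out.println("tolerant lookup: " + planner.tolerantSize(marker, 1024L));
    }
}
```

In this model the strict lookup corresponds to the failure reported here, while the tolerant lookup is one possible shape for the fix. How the real planner should estimate the size of a pending source is left open.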


> Illegal State Exception
> -----------------------
>
>                 Key: CRUNCH-361
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-361
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.9.0, 0.8.2
>            Reporter: Jinal Shah
>            Assignee: Josh Wills
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
