flink-dev mailing list archives

From Vasiliki Kalavri <vasilikikala...@gmail.com>
Subject Re: NullPointerException in DeltaIteration when no ForwardedFileds annotation
Date Mon, 27 Apr 2015 15:07:42 GMT
Will do, thanks!

On 27 April 2015 at 11:06, Fabian Hueske <fhueske@gmail.com> wrote:

> No, haven't looked at it since my last mail :-(
> Both plans (with and without forward fields annotation) look good except
> for the suspicious pipeline breaker.
>
> @Vasia Could you open a JIRA and assign it to me?
> I'll have a closer look and try to figure out what's going on.
>
>
> 2015-04-27 10:34 GMT+02:00 Stephan Ewen <sewen@apache.org>:
>
> > I think Fabian looked into this a while back...
> >
> > @Fabian, do you have any insight into what causes this?
> >
> >
> > On Sat, Apr 25, 2015 at 7:46 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I actually ran into this problem again with a different algorithm :/
> > > Same exception and it looks like getMatchFor() in CompactingHashTable
> > > returns a null record.
> > > > Not sure why, or why the annotation prevents this from happening. Any
> > > insight is highly welcome :-)
> > >
> > > Shall I open an issue so that we don't forget about this?
> > >
> > > -Vasia.
> > >
> > >
> > > On 4 April 2015 at 14:44, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> > >
> > > > Hi Fabian,
> > > >
> > > > thanks for looking into this.
> > > > Let me know if there's anything I can do to help!
> > > >
> > > > Cheers,
> > > > V.
> > > >
> > > > On 3 April 2015 at 22:31, Fabian Hueske <fhueske@gmail.com> wrote:
> > > >
> > > >> Thanks for the nice setup!
> > > >> I could easily reproduce the exception you are facing.
> > > >> But that's the only good news so far :-(
> > > >>
> > > >> I checked the plans and both are valid and should compute the correct
> > > >> result for the program.
> > > >> The split-off solution set delta is required because it needs to be
> > > >> repartitioned (without the annotation, the optimizer does not know that
> > > >> it is in fact already correctly partitioned). One thing that made me a
> > > >> bit suspicious is that the solution set delta partitioning is marked
> > > >> with a Pipeline-Breaker. The pipeline breaker shouldn't make a semantic
> > > >> difference, but I am not sure if it is really required, and that part
> > > >> of the codebase was also recently worked on.
> > > >>
> > > >> So, a closer look and more debugging are necessary to figure out what is
> > > >> not working correctly here...
> > > >>
> > > >>
> > > >> 2015-04-03 14:14 GMT+02:00 Vasiliki Kalavri <vasilikikalavri@gmail.com>:
> > > >>
> > > >> > Hi Fabian,
> > > >> >
> > > >> > I am using the dblp co-authorship dataset from SNAP:
> > > >> > http://snap.stanford.edu/data/com-DBLP.html
> > > >> > I also pushed my slightly modified version of ConnectedComponents here:
> > > >> > https://github.com/vasia/flink/tree/cc-test. It basically generates the
> > > >> > vertex dataset from the edges, so that you don't need to create it
> > > >> > separately.
> > > >> > The annotation that creates the error is in line #172.
> > > >> >
> > > >> > Thanks a lot :))
> > > >> >
> > > >> > -Vasia.
> > > >> >
> > > >> >
> > > >> > On 3 April 2015 at 13:09, Fabian Hueske <fhueske@gmail.com> wrote:
> > > >> >
> > > >> > > That looks pretty much like a bug.
> > > >> > >
> > > >> > > As you said, fwd fields annotations are optional and may improve the
> > > >> > > performance of a program, but never change its semantics (if set
> > > >> > > correctly).
> > > >> > >
> > > >> > > I'll have a look at it later.
> > > >> > > Would be great if you could provide some data to reproduce the bug.
> > > >> > >
> > > >> > > On Apr 3, 2015 12:48 PM, "Vasiliki Kalavri" <vasilikikalavri@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hello to my squirrels,
> > > >> > > >
> > > >> > > > I've been getting a NullPointerException for a DeltaIteration program
> > > >> > > > I'm trying to implement and I could really use your help :-)
> > > >> > > > It seems that some of the input Tuples of the Join operator that I'm
> > > >> > > > using to create the next workset / solution set delta are null.
> > > >> > > > It also seems that adding ForwardedFields annotations solves the issue.
> > > >> > > >
> > > >> > > > I managed to reproduce the behavior using the ConnectedComponents
> > > >> > > > example, by removing the "@ForwardedFieldsFirst("*")" annotation from
> > > >> > > > the ComponentIdFilter join.
> > > >> > > > The exception message is the following:
> > > >> > > >
> > > >> > > > Caused by: java.lang.NullPointerException
> > > >> > > > at org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:186)
> > > >> > > > at org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:1)
> > > >> > > > at org.apache.flink.runtime.operators.JoinWithSolutionSetSecondDriver.run(JoinWithSolutionSetSecondDriver.java:198)
> > > >> > > > at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
> > > >> > > > at org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
> > > >> > > > at org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
> > > >> > > > at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
> > > >> > > > at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
> > > >> > > > at java.lang.Thread.run(Thread.java:745)
> > > >> > > >
> > > >> > > > I get this error locally with any sufficiently big dataset (~10000
> > > >> > > > nodes).
> > > >> > > > When the annotation is in place, it works without problem.
> > > >> > > > I also generated the optimizer plans for the two cases:
> > > >> > > > - with annotation (working):
> > > >> > > > https://gist.github.com/vasia/4f4dc6b0cc6c72b5b64b
> > > >> > > > - without annotation (failing):
> > > >> > > > https://gist.github.com/vasia/086faa45b980bf7f4c09
> > > >> > > >
> > > >> > > > After visualizing the plans, the main difference I see is that in the
> > > >> > > > working case, the next workset node and the solution set delta nodes
> > > >> > > > are merged, while in the failing case they are separate.
> > > >> > > >
> > > >> > > > Shouldn't this work with and without annotation (but be more efficient
> > > >> > > > with the annotation in place)? Or am I missing something here?
> > > >> > > >
> > > >> > > > Thanks in advance for any help :))
> > > >> > > >
> > > >> > > > Cheers,
> > > >> > > > - Vasia.
