flink-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: CompactingHashTable question
Date Wed, 16 Sep 2015 15:33:58 GMT
Yes, probing the HashTable with a key that does not exist will yield a join
function call with a null value (or empty iterator in case of CoGroup).

The semantics of the join are the same regardless of the hash table
implementation.
The fact that the error occurs only with the managed HT indicates that
there is a bug somewhere :-(

2015-09-16 17:26 GMT+02:00 Vasiliki Kalavri <vasilikikalavri@gmail.com>:

> Hi,
>
> thanks a lot Fabian!
>
> I didn't know that join with the solution set is an outer join. That's a
> surprise :)
>
> So, if I understand correctly, I should have a null value when my other
> input to the join contains some key that doesn't exist in the solution set,
> right? That's not the case in my application; I'm not generating any new
> keys.
>
> Also, when setting the solutionSetUnManaged option, the exception doesn't
> occur anymore. Are the join semantics different when the solution set is in
> unmanaged memory?
>
> Cheers,
> Vasia.
>
>
> On 16 September 2015 at 16:50, Fabian Hueske <fhueske@gmail.com> wrote:
>
> > Hi Vasia,
> >
> > I looked into the code. A serializer should never return null when
> > deserializing. Either it does not detect that something went wrong with
> > the deserialization or it should throw an exception.
> >
> > Regarding the handling of null returns in the Drivers: if there is no
> > entry in the HT for a certain key, the HT will return null, which is
> > expected. If a CoGroupWithSolutionSet*Driver receives a null value, it
> > gives an empty iterator to the user function. The
> > JoinWithSolutionSet*Driver calls the join function with a null value.
> > Both behaviors are expected. A join with a solution set is actually an
> > outer join, and a join function in such a join needs to be able to handle
> > null values on the solution set side.
> >
> > Cheers, Fabian
> >
> >
> > 2015-09-15 17:41 GMT+02:00 Vasiliki Kalavri <vasilikikalavri@gmail.com>:
> >
> > > Hello to my squirrels,
> > >
> > > I ran into an NPE for some iterations code and it looks like what's
> > > described in FLINK-2443
> > > <https://issues.apache.org/jira/browse/FLINK-2443>.
> > > I'm trying to understand the problem and I could really use your help :)
> > >
> > > So far, it seems that the exception is caused by a null value returned
> > > by CompactingHashTable.*getMatchFor*(PT probeSideRecord).
> > >
> > > This method returns null in the following cases:
> > > - when the hash table is "closed"
> > > - when the segment is done
> > > - if the serializer actually returns a null record
> > >
> > > It seems that on the join/cogroup driver side there is no check or
> > > special handling when the build side record is null, i.e. the null
> > > record is still passed to the join function.
> > > Is this correct and if not, what should the driver do in this case?
> > >
> > > Thank you!
> > >
> > > Cheers,
> > > Vasia.
> > >
> >
>
