flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: inconsistency in count and print
Date Sat, 16 May 2015 10:00:57 GMT
Invalid hash values can certainly cause non-deterministic results.

Can you provide a code snippet that shows how and where you used the Guava
Hasher?
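
One possible cause (an assumption, since the original code is not shown) is a stateful hasher object that is shared between records or parallel tasks. Guava's Hasher accumulates state and is not meant to be reused after hash() has been called, and the JDK's MessageDigest behaves similarly. The sketch below uses only the JDK's MessageDigest as a stand-in for Guava and creates a fresh instance per call, so no state is shared. The class and method names are hypothetical, and the resulting long is not claimed to be byte-compatible with Guava's asLong().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not taken from the original program.
final class StableHash {
    private StableHash() {}

    // Returns the first 8 bytes of SHA-256(id), read as a big-endian long.
    static long sha256AsLong(String id) {
        try {
            // A fresh MessageDigest per call: nothing is shared between
            // records or threads, so the same id always yields the same value.
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(id.getBytes(StandardCharsets.UTF_8));
            long result = 0L;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (bytes[i] & 0xFFL);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling StableHash.sha256AsLong(id) from a map function should then give the same value for the same id on every run and on every machine, which is what a key used in join or group needs.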

2015-05-16 11:52 GMT+02:00 Michele Bertoni <michele1.bertoni@mail.polimi.it>:

> Is it possible that this is due to the hasher?
>
> In my code I was using the Google Guava hasher (SHA-256 as a Long hash).
> Sometimes I got errors from it (ArrayIndexOutOfBoundsException), and
> sometimes I just got a different hash for the same id, especially when
> running on a non-local execution environment.
>
> I removed it everywhere and started using the Java hashCode; now it
> seems to work.
>
>
> > On 16 May 2015, at 09:15, Michele Bertoni <michele1.bertoni@mail.polimi.it> wrote:
> >
> > Hi,
> > I have been going mad over a problem for 2 days: every time I run the code (on the same dataset) I get a different result.
> >
> > While debugging, I found this.
> >
> > I have this code:
> >
> > val aggregationResult = // something that creates the dataset and uses join, group, reduce and map
> > logger.error("res count " + aggregationResult.count)
> > aggregationResult.print
> >
> >
> >
> > The logger prints a dataset size of 7.
> > The printed output contains 6 elements.
> >
> > This happens randomly: sometimes the result is larger than the count, and sometimes they are both correct at 10.
> >
> >
> >
> > Flink version: 0.9 milestone 1
> >
> > Any idea what can make it “not deterministic”?
> > Thanks for the help.
>
>
