flink-user mailing list archives

From Michele Bertoni <michele1.bert...@mail.polimi.it>
Subject Re: inconsistency in count and print
Date Sat, 16 May 2015 09:52:43 GMT
Could this be caused by the hasher?

In my code I was using the Google Guava hasher (SHA-256, taken as a Long hash).
Sometimes it threw errors (ArrayIndexOutOfBoundsException), and sometimes I just got a different hash
for the same id, especially when running on a non-local execution environment.

I removed it everywhere and switched to the Java hashCode; now it seems to work.
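
For readers who still need a SHA-256-based Long rather than hashCode, here is a minimal sketch of a per-call hash. It rests on an assumption that the message does not confirm: that the unstable hashes came from reusing one stateful hasher object across records or threads. It uses the JDK's MessageDigest instead of Guava, and `StableHash` and `sha256AsLong` are hypothetical names chosen for illustration.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.security.MessageDigest
import java.util.concurrent.{Callable, Executors}

object StableHash {
  // Hypothetical helper: maps an id to a Long taken from the first
  // 8 bytes of its SHA-256 digest. A new MessageDigest is created on
  // every call because MessageDigest instances are stateful and not
  // thread-safe, so no state is shared between calls or threads.
  def sha256AsLong(id: String): Long = {
    val digest = MessageDigest.getInstance("SHA-256")
    val bytes = digest.digest(id.getBytes(StandardCharsets.UTF_8))
    ByteBuffer.wrap(bytes).getLong()
  }

  def main(args: Array[String]): Unit = {
    val ids = Seq("id-1", "id-2", "id-3")
    val pool = Executors.newFixedThreadPool(4)
    try {
      // Hash the same ids from several concurrent tasks.
      val futures = (1 to 8).map { _ =>
        pool.submit(new Callable[Seq[Long]] {
          override def call(): Seq[Long] = ids.map(sha256AsLong)
        })
      }
      val results = futures.map(_.get())
      // true when every task produced the same hashes for the same ids
      println(results.distinct.size == 1)
    } finally {
      pool.shutdown()
    }
  }
}
```

With Guava the equivalent precaution would be to request a fresh Hasher for each value instead of keeping one in a field; whether that was the actual cause here is not established by this thread.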


> On 16 May 2015, at 09:15, Michele Bertoni <michele1.bertoni@mail.polimi.it> wrote:
> 
> Hi,
> I have been going mad over a problem for two days: every time I run the code (on the same dataset) I get a different result.
> 
> While debugging, I found this.
> 
> I have this code:
> 
> val aggregationResult = // something that creates the dataset and uses join, group, reduce and map
> logger.error("res count " + aggregationResult.count)
> aggregationResult.print
> 
> 
> 
> The logger prints a dataset size of 7,
> but the printed output contains 6 elements.
> 
> This happens randomly: sometimes the result is larger than the count, and sometimes they are both correct at 10.
> 
> 
> 
> flink version 0.9milestone1
> 
> Any idea what could make it “not deterministic”?
> Thanks for the help.
