flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: count + aggregation
Date Mon, 04 Sep 2017 14:54:25 GMT
Hi Alieh,

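I'm not aware of a solution to the first problem, but for the second issue
you should use maxBy() instead of max(). max() only aggregates the given
field and leaves the remaining fields undefined, whereas maxBy() returns the
complete tuple that holds the maximum.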
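
To illustrate the difference, here is a minimal plain-Java sketch (not Flink
code) of the two semantics. In the DataSet API the corresponding calls on a
DataSet<Tuple2<String, Integer>> would be max(1) and maxBy(1). The class name,
the helper names and the sample data are made up for illustration, and taking
the first record's key in fieldMax() is only an assumption that stands in for
"some arbitrary value":

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class MaxVsMaxBy {

    // Hypothetical helper: builds a (String, Integer) pair.
    public static Map.Entry<String, Integer> entry(String key, int value) {
        return new SimpleEntry<String, Integer>(key, value);
    }

    // Field-wise aggregation: only the aggregated field is meaningful.
    // The key is copied from the first record (an assumption that mimics
    // an arbitrary, non-corresponding value).
    public static Map.Entry<String, Integer> fieldMax(List<Map.Entry<String, Integer>> data) {
        int max = data.stream().mapToInt(Map.Entry::getValue).max().getAsInt();
        return entry(data.get(0).getKey(), max);
    }

    // Record selection: returns the whole record that holds the maximum value.
    public static Map.Entry<String, Integer> recordMaxBy(List<Map.Entry<String, Integer>> data) {
        return data.stream().max(Map.Entry.comparingByValue()).get();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> data =
                Arrays.asList(entry("a", 3), entry("b", 7), entry("c", 5));

        // The maximum 7 is paired with a key that does not belong to it.
        System.out.println("field-wise max: " + fieldMax(data));
        // The maximum 7 stays together with its own key "b".
        System.out.println("maxBy:          " + recordMaxBy(data));
    }
}
```

With the sample data, the field-wise variant pairs the maximum with the key
"a", while the record-selecting variant keeps it with "b", which is the
behaviour you expected from the aggregation.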

Best, Fabian

2017-09-04 16:08 GMT+02:00 Alieh <saeedi@informatik.uni-leipzig.de>:

> Hello all,
> 1st question:
> Is there any way to know the count or the content of a Flink DataSet
> without using count() or collect()? The problem is that I have a loop whose
> number of iterations depends on the count of a DataSet. Using count()
> may force the whole pipeline to be executed again. I would prefer not to
> use delta or bulk iterations.
> 2nd question:
> Using "Aggregations.MAX" on the second field of a DataSet of
> Tuple2<String, Integer>, I observed that the second field is the real
> maximum of the whole dataset, while the first field is not the one that
> corresponds to it!!!
> Best,
> Alieh
