flink-user mailing list archives

From Fabian Hueske <fhue...@apache.org>
Subject Re: Wrong and non consistent behavior of max
Date Fri, 28 Nov 2014 13:59:07 GMT
I am not sure about this. With the new aggregations that Viktor is working
on, things become pretty obvious IMO.

data.aggregate(count(), sum(2), min(0));
basically shows the structure of the result.

I would not go and overload min() and max() in different contexts.

2014-11-28 14:53 GMT+01:00 Ufuk Celebi <uce@apache.org>:

> This is not the first time that people have been confused by this. I think
> most people expect the maxBy and minBy behaviour for max/min.
>
> Maybe it makes sense to move back to the old aggregations API, where you
> call the aggregate method and specify as an argument which type of
> aggregation should be performed. I didn't really like this, but if the
> current state is confusing people, we should consider changing it again.
>
> On Fri, Nov 28, 2014 at 12:31 PM, Maximilian Alber <
> alber.maximilian@gmail.com> wrote:
>
>> Hi Fabian!
>>
>> Ok, thanks! Now it works.
>>
>> Cheers,
>> Max
>>
>> On Fri, Nov 28, 2014 at 1:47 AM, Fabian Hueske <fhueske@apache.org>
>> wrote:
>>
>>> Hi Max,
>>>
>>> the max(i) function does not select the Tuple with the maximum value.
>>> Instead, it builds a new Tuple with the maximum value for the i-th
>>> attribute. The values of the Tuple's other fields are not defined (in
>>> practice they are set to the values of the last Tuple; however, the order
>>> of Tuples is not defined).
>>>
>>> The Java API features minBy and maxBy transformations that should do
>>> what you are looking for.
>>> You can reimplement them for Scala as a simple GroupReduce (or Reduce)
>>> function or use the Java function in your Scala code.
>>>
>>> Best, Fabian
>>>
>>>
>>>
>>> 2014-11-27 16:14 GMT+01:00 Maximilian Alber <alber.maximilian@gmail.com>
>>> :
>>>
>>>> Hi Flinksters,
>>>>
>>>> I don't know if I did something wrong, but the code seems fine. Basically
>>>> the max function extracts the wrong element.
>>>>
>>>> The error only happens with my real data, not if I inject some
>>>> sequence into costs.
>>>>
>>>> The problem is that the other tuple field (the id) is wrong. The
>>>> maximum of the second field is detected correctly.
>>>>
>>>> The code snippet:
>>>>
>>>> val maxCost = costs map {x => (x.id, x.value)} max(1)
>>>>
>>>> (costs map {x => (x.id, x.value)} map {_ toString} map {"first: "+ _
>>>> }) union (maxCost map {_ toString} map {"second: "+ _ }) writeAsText
>>>> config.outFile
>>>>
>>>> The output:
>>>>
>>>> File content:
>>>> first: (47,42.066986)
>>>> first: (11,4.448255)
>>>> first: (40,42.06696)
>>>> first: (3,0.96731037)
>>>> first: (31,42.06443)
>>>> first: (18,23.753584)
>>>> first: (45,42.066986)
>>>> first: (24,41.44347)
>>>> first: (13,6.1290965)
>>>> first: (19,26.42948)
>>>> first: (1,0.9665109)
>>>> first: (28,42.04222)
>>>> first: (5,1.2986814)
>>>> first: (44,42.066986)
>>>> first: (7,1.8681992)
>>>> first: (10,3.0981758)
>>>> first: (41,42.066982)
>>>> first: (48,42.066986)
>>>> first: (21,33.698544)
>>>> first: (38,42.066963)
>>>> first: (30,42.06153)
>>>> first: (26,41.950237)
>>>> first: (43,42.066986)
>>>> first: (16,14.754578)
>>>> first: (15,10.571205)
>>>> first: (34,42.06672)
>>>> first: (29,42.055424)
>>>> first: (35,42.066845)
>>>> first: (8,1.9513339)
>>>> first: (22,38.189228)
>>>> first: (46,42.066986)
>>>> first: (2,0.966511)
>>>> first: (27,42.013676)
>>>> first: (12,5.4271784)
>>>> first: (42,42.066986)
>>>> first: (4,1.01561)
>>>> first: (14,7.4410205)
>>>> first: (25,41.803535)
>>>> first: (6,1.6827519)
>>>> first: (36,42.06694)
>>>> first: (20,28.834095)
>>>> first: (32,42.06577)
>>>> first: (49,42.066986)
>>>> first: (33,42.0664)
>>>> first: (9,2.2420964)
>>>> first: (37,42.066967)
>>>> first: (0,0.9665109)
>>>> first: (17,19.016153)
>>>> first: (39,42.06697)
>>>> first: (23,40.512672)
>>>> second: (23,42.066986)
>>>>
>>>> File content end.
>>>>
>>>>
>>>> Thanks!
>>>> Cheers,
>>>> Max
>>>>
>>>>
>>>
>>
>
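
The difference Fabian describes between max(i) and maxBy can be sketched in plain Java without any Flink dependency. This is only an illustration under stated assumptions, not Flink's implementation: the class MaxVsMaxBy and the methods fieldMax and maxBy are hypothetical names, rows are modelled as {id, value} arrays of doubles for brevity, and the "other fields come from the last Tuple" behaviour is copied from the description in the thread. The three rows are taken from the output posted above.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch (no Flink) of the two semantics discussed in the thread.
// All names here are hypothetical; rows are {id, value} pairs.
class MaxVsMaxBy {

    // Field-wise aggregation, like max(field): only the aggregated field is
    // meaningful. The other field is copied from whichever row was processed
    // last, mirroring the behaviour described in the thread.
    static double[] fieldMax(List<double[]> rows, int field) {
        double[] result = null;
        for (double[] row : rows) {
            if (result == null) {
                result = row.clone();
            } else {
                double best = Math.max(result[field], row[field]);
                result = row.clone();   // other fields: taken from the latest row
                result[field] = best;   // aggregated field: running maximum
            }
        }
        return result;                  // null for an empty input
    }

    // Row selection, like maxBy(field): returns one of the actual input rows
    // (the first one seen with the largest value in the given field).
    static double[] maxBy(List<double[]> rows, int field) {
        double[] best = null;
        for (double[] row : rows) {
            if (best == null || row[field] > best[field]) {
                best = row;
            }
        }
        return best;                    // null for an empty input
    }

    public static void main(String[] args) {
        List<double[]> costs = Arrays.asList(
            new double[] {47, 42.066986},
            new double[] {11, 4.448255},
            new double[] {23, 40.512672});

        // Aggregation: maximum value 42.066986, but paired with id 23 (last row).
        System.out.println(Arrays.toString(fieldMax(costs, 1)));
        // Selection: the original row with id 47 and value 42.066986.
        System.out.println(Arrays.toString(maxBy(costs, 1)));
    }
}
```

With this input order, fieldMax pairs the correct maximum with the id of the last row, which matches the "second: (23,42.066986)" line in the original report, while maxBy returns a row that actually exists in the input.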
