flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Wrong and non consistent behavior of max
Date Fri, 28 Nov 2014 15:32:05 GMT
No, I didn't say that I want to overload min/max. I think Viktor's changes
are exactly in line with what I said. min/max(X) or minBy/maxBy(X) could be
(maybe deprecated) shortcuts for aggregate(max/min(X)).

On Fri, Nov 28, 2014 at 2:59 PM, Fabian Hueske <fhueske@apache.org> wrote:

> I am not sure about this. With the new aggregations that Viktor is working
> on, things become pretty obvious IMO.
>
> data.aggregate(count(), sum(2), min(0));
> basically shows the structure of the result.
>
> I would not go and overload min() and max() in different contexts.
>
> 2014-11-28 14:53 GMT+01:00 Ufuk Celebi <uce@apache.org>:
>
>> This is not the first time that people have been confused by this. I think
>> most people expect the maxBy and minBy behaviour for max/min.
>>
>> Maybe it makes sense to move back to the old aggregations API, where you
>> call the aggregate method and specify as an argument which type of
>> aggregation should be performed. I didn't really like this, but if the
>> current state is confusing people, we should consider changing it again.
>>
>> On Fri, Nov 28, 2014 at 12:31 PM, Maximilian Alber <
>> alber.maximilian@gmail.com> wrote:
>>
>>> Hi Fabian!
>>>
>>> Ok, thanks! Now it works.
>>>
>>> Cheers,
>>> Max
>>>
>>> On Fri, Nov 28, 2014 at 1:47 AM, Fabian Hueske <fhueske@apache.org>
>>> wrote:
>>>
>>>> Hi Max,
>>>>
>>>> the max(i) function does not select the Tuple with the maximum value.
>>>> Instead, it builds a new Tuple with the maximum value for the i-th
>>>> attribute. The values of the Tuple's other fields are not defined (in
>>>> practice they are set to the values of the last Tuple processed, but the
>>>> order of Tuples is not defined).
>>>>
>>>> The Java API features minBy and maxBy transformations that should do
>>>> what you are looking for.
>>>> You can reimplement them for Scala as a simple GroupReduce (or Reduce)
>>>> function or use the Java function in your Scala code.
>>>>
>>>> Best, Fabian
>>>>
>>>>
>>>>
>>>> 2014-11-27 16:14 GMT+01:00 Maximilian Alber <alber.maximilian@gmail.com>:
>>>>
>>>>> Hi Flinksters,
>>>>>
>>>>> I don't know if I did something wrong, but the code seems fine.
>>>>> Basically, the max function extracts the wrong element.
>>>>>
>>>>> The error only happens with my real data, not if I inject some
>>>>> sequence into costs.
>>>>>
>>>>> The problem is that the corresponding tuple value at the other position
>>>>> is wrong. The maximum of the second field is detected correctly.
>>>>>
>>>>> The code snippet:
>>>>>
>>>>> val maxCost = costs map {x => (x.id, x.value)} max(1)
>>>>>
>>>>> (costs map {x => (x.id, x.value)} map {_ toString} map {"first: "+ _
>>>>> }) union (maxCost map {_ toString} map {"second: "+ _ }) writeAsText
>>>>> config.outFile
>>>>>
>>>>> The output:
>>>>>
>>>>> File content:
>>>>> first: (47,42.066986)
>>>>> first: (11,4.448255)
>>>>> first: (40,42.06696)
>>>>> first: (3,0.96731037)
>>>>> first: (31,42.06443)
>>>>> first: (18,23.753584)
>>>>> first: (45,42.066986)
>>>>> first: (24,41.44347)
>>>>> first: (13,6.1290965)
>>>>> first: (19,26.42948)
>>>>> first: (1,0.9665109)
>>>>> first: (28,42.04222)
>>>>> first: (5,1.2986814)
>>>>> first: (44,42.066986)
>>>>> first: (7,1.8681992)
>>>>> first: (10,3.0981758)
>>>>> first: (41,42.066982)
>>>>> first: (48,42.066986)
>>>>> first: (21,33.698544)
>>>>> first: (38,42.066963)
>>>>> first: (30,42.06153)
>>>>> first: (26,41.950237)
>>>>> first: (43,42.066986)
>>>>> first: (16,14.754578)
>>>>> first: (15,10.571205)
>>>>> first: (34,42.06672)
>>>>> first: (29,42.055424)
>>>>> first: (35,42.066845)
>>>>> first: (8,1.9513339)
>>>>> first: (22,38.189228)
>>>>> first: (46,42.066986)
>>>>> first: (2,0.966511)
>>>>> first: (27,42.013676)
>>>>> first: (12,5.4271784)
>>>>> first: (42,42.066986)
>>>>> first: (4,1.01561)
>>>>> first: (14,7.4410205)
>>>>> first: (25,41.803535)
>>>>> first: (6,1.6827519)
>>>>> first: (36,42.06694)
>>>>> first: (20,28.834095)
>>>>> first: (32,42.06577)
>>>>> first: (49,42.066986)
>>>>> first: (33,42.0664)
>>>>> first: (9,2.2420964)
>>>>> first: (37,42.066967)
>>>>> first: (0,0.9665109)
>>>>> first: (17,19.016153)
>>>>> first: (39,42.06697)
>>>>> first: (23,40.512672)
>>>>> second: (23,42.066986)
>>>>>
>>>>> File content end.
>>>>>
>>>>>
>>>>> Thanks!
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>>
>>>>
>>>
>>
>
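
The difference Fabian describes between max(i) and maxBy(i) can be sketched with plain Scala collections. This is only an illustration under stated assumptions: it does not use the Flink API, the names MaxVsMaxBy, fieldMax and maxByValue are made up for this example, and the left-to-right reduce order stands in for the undefined Tuple order of a real job.

```scala
// Plain Scala collections stand in for a Flink DataSet; no Flink call is used.
// All names here (MaxVsMaxBy, fieldMax, maxByValue) are hypothetical.
object MaxVsMaxBy {
  type Row = (Int, Float)

  // Mimics the max(1) behaviour described in the thread: only field 1 is
  // aggregated, and field 0 comes from whichever row is combined last.
  def fieldMax(rows: Seq[Row]): Row =
    rows.reduceLeft((a, b) => (b._1, math.max(a._2, b._2)))

  // Mimics maxBy(1) written as a simple Reduce: keep the whole row whose
  // field 1 is larger. On ties the earlier row is kept, which is an
  // arbitrary choice made for this sketch.
  def maxByValue(rows: Seq[Row]): Row =
    rows.reduceLeft((a, b) => if (b._2 > a._2) b else a)

  // Three rows taken from the output quoted in the thread.
  val sample: Seq[Row] = Seq((47, 42.066986f), (11, 4.448255f), (23, 40.512672f))
}
```

With this sample, MaxVsMaxBy.fieldMax(MaxVsMaxBy.sample) gives (23,42.066986), the same mixed tuple as the "second:" line in the output above, while MaxVsMaxBy.maxByValue(MaxVsMaxBy.sample) gives the intact row (47,42.066986).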
