flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ron Crocker <rcroc...@newrelic.com>
Subject Re: Fold vs Reduce in DataStream API
Date Thu, 19 Nov 2015 19:15:39 GMT
Thanks Stephan, that helps quite a bit. Looks like another one of those API
changes that I'll be struggling with for a little bit.

On Thu, Nov 19, 2015 at 10:40 AM, Stephan Ewen <sewen@apache.org> wrote:

> Hi Ron!
>
> You are right, there is a copy/paste error in the docs, it should be a
> FoldFunction that is passed to fold(), not a ReduceFunction.
>
> In Flink-0.10, the FoldFunction is only available on
>
>   - KeyedStream (
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/streaming/api/datastream/KeyedStream.html#fold(R,%20org.apache.flink.api.common.functions.FoldFunction)
> )
>
>   - WindowedStream (
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/streaming/api/datastream/WindowedStream.html#fold(R,%20org.apache.flink.api.common.functions.FoldFunction,%20org.apache.flink.api.common.typeinfo.TypeInformation)
> )
>
> In most cases, you probably want the variant on the WindowedStream, if you
> aggregate values over time.
>
> --------------------------------------------------------
>
> To the difference between fold() and reduce(): It is very subtle. The fold
> function can also convert to another type whenever it integrates a new
> element.
>
> Here is an example (with lists, not streams, but same principle).
>
> --------------------------------------------------------
>
> ReduceFunction<Integer> {
>
>   public Integer reduce(Integer a, Integer b) { return a + b; }
> }
>
> [1, 2, 3, 4, 5] -> reduce()  means: ((((1 + 2) + 3) + 4) + 5) = 15
>
> --------------------------------------------------------
>
> FoldFunction<String, Integer> {
>
>   public String fold(String current, Integer i) { return current +
> String.valueOf(i); }
> }
>
> [1, 2, 3, 4, 5] -> fold("start-")  means: ((((("start-" + 1) + 2) + 3) +
> 4) + 5) = "start-12345" (as a String)
>
>
> I hope that example illustrates the difference.
>
>
> Greetings,
> Stephan
>
>
> On Thu, Nov 19, 2015 at 7:00 PM, Ron Crocker <rcrocker@newrelic.com>
> wrote:
>
>> Hi Fabian -
>>
>> Thanks Fabian, that is a helpful description.
>>
>> That document WAS my source of information and it seems to also be the
>> source of my confusion. Further, it appears to be wrong - there is a
>> FoldFunction<O,T> (
>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/api/common/functions/FoldFunction.html)
>> that should be passed into fold()?
>>
>> Separate note: fold() doesn't appear in the javadocs for 0.10.0
>> DataStream (see
>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/streaming/api/datastream/DataStream.html).
>> So this made me look in the freshly-downloaded flink-streaming-java:0.10.0
>> and fold() does not appear in org
>> .apache.flink.streaming.api.datastream.DataStream either. Am I looking
>> in the wrong place for it? In 0.9.1, it's located in that same class with
>> this signature: fold(R initialValue, FoldFunction<OUT, R> folder).
>>
>> Ron
>>
>> On Wed, Nov 18, 2015 at 9:39 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>>
>>> Hi Ron,
>>>
>>> Have you checked:
>>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/apis/streaming_guide.html#transformations
>>> ?
>>>
>>> Fold is like reduce, except that you define a start element (of a
>>> different type than the input type) and the result type is the type of the
>>> initial value. In reduce, the result type must be identical to the input
>>> type.
>>>
>>> Best, Fabian
>>>
>>> 2015-11-18 18:32 GMT+01:00 Ron Crocker <rcrocker@newrelic.com>:
>>>
>>>> Is there a succinct description of the distinction between these
>>>> transforms?
>>>>
>>>
>> --
>> Ron Crocker
>> Principal Software Engineer
>> ( ( •)) New Relic
>> rcrocker@newrelic.com
>> M: +1 630 363 8835
>>
>
>


-- 
Ron Crocker
Principal Software Engineer
( ( •)) New Relic
rcrocker@newrelic.com
M: +1 630 363 8835

Mime
View raw message