flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-970) Implement a first(n) operator
Date Mon, 28 Jul 2014 13:46:40 GMT

    [ https://issues.apache.org/jira/browse/FLINK-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076227#comment-14076227 ]

Fabian Hueske commented on FLINK-970:
-------------------------------------

Almost right ;-)

You can (and should) always use a combiner, including for the AllReduce / first(n) on a DataSet.
The combiner reduces each partition to at most n elements, so at most dop*n elements
(dop = degree of parallelism) are shipped to a single partition, where the reducer emits
the first n elements.

Btw. in FirstReducer you can simply define combine as

{code}
public void combine(Iterator<T> values, Collector<T> out) throws Exception {
  reduce(values, out);
}
{code}
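
To illustrate the data flow described above, here is a minimal sketch in plain Java that does not use the Flink API. The class FirstNSketch and its methods are hypothetical names made up for this example, and partitions are modeled as simple lists. The same "take at most n" logic serves as both the combiner and the final reducer, which is why combine can just delegate to reduce.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-alone model of the combine-then-reduce flow; not Flink code.
public class FirstNSketch {

    // Emits at most n elements from the iterator.
    // Plays the role of both combine() and reduce().
    static <T> List<T> firstN(Iterator<T> values, int n) {
        List<T> out = new ArrayList<>();
        while (out.size() < n && values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    // Applies the combiner to every partition, "ships" the survivors to one
    // place, and applies the reducer once more to get the final n elements.
    static <T> List<T> firstNAcrossPartitions(List<List<T>> partitions, int n) {
        List<T> shipped = new ArrayList<>();
        for (List<T> partition : partitions) {
            // Combiner: at most n elements leave each partition.
            shipped.addAll(firstN(partition.iterator(), n));
        }
        // Here shipped.size() <= partitions.size() * n, i.e. at most dop*n.
        return firstN(shipped.iterator(), n);
    }
}
```

With three partitions and n = 2, at most six elements reach the final step regardless of how large the partitions are.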

> Implement a first(n) operator
> -----------------------------
>
>                 Key: FLINK-970
>                 URL: https://issues.apache.org/jira/browse/FLINK-970
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Timo Walther
>            Assignee: Chesnay Schepler
>            Priority: Minor
>
> It is only syntactic sugar, but I had many cases where I just needed the first element or the first 2 elements in a GroupReduce.
> E.g. instead of
> {code:java}
> .reduceGroup(new GroupReduceFunction<String, String>() {
>     @Override
>     public void reduce(Iterator<String> values, Collector<String> out) throws Exception {
>         out.collect(values.next());
>     }
> })
> {code}
> one could write
> {code:java}
> .first()
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
