flink-issues mailing list archives

From StephanEwen <...@git.apache.org>
Subject [GitHub] incubator-flink pull request: New operator map partition function
Date Thu, 03 Jul 2014 11:46:58 GMT
Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/42#discussion_r14509280
  
    --- Diff: stratosphere-java/src/main/java/eu/stratosphere/api/java/DataSet.java ---
    @@ -135,6 +139,27 @@ public ExecutionEnvironment getExecutionEnvironment() {
     		}
     		return new MapOperator<T, R>(this, mapper);
     	}
    +
    +
    +
    +    /**
    +     * Applies a Map transformation on a {@link DataSet} by using an iterator.<br/>
    --- End diff --
    
    I think this comment is not quite accurate. Something more appropriate would be:
    
    ```
    Applies a Map operation to the entire partition of the data. The function is called once
    per parallel partition of the data, and the entire partition is available through the given
    Iterator. The number of elements that each instance of the MapPartition function sees is
    non-deterministic and depends on the degree of parallelism of the operation.
    
    This function is intended for operations that cannot transform individual elements and
    require no grouping of elements. To transform individual elements, the use of {@code map()}
    and {@code flatMap()} is preferable.
    ```
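
To illustrate the semantics described in the suggested comment, here is a minimal plain-Java sketch. It does not use the project's API: the class `MapPartitionSketch`, the helper `mapPartition`, and the round-robin split are all assumptions made up for this illustration, and a real runtime may distribute elements differently. The point is only that the function runs once per partition and that the number of elements it sees changes with the degree of parallelism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class MapPartitionSketch {

    // Hypothetical helper (not the project's API): splits the input round-robin
    // into `parallelism` partitions and calls `fn` exactly once per partition,
    // handing it an Iterator over that partition's elements.
    static <T, R> List<R> mapPartition(List<T> data, int parallelism,
                                       Function<Iterator<T>, R> fn) {
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int i = 0; i < data.size(); i++) {
            partitions.get(i % parallelism).add(data.get(i));
        }
        List<R> results = new ArrayList<>();
        for (List<T> partition : partitions) {
            results.add(fn.apply(partition.iterator()));
        }
        return results;
    }

    // A per-partition function: counts the elements of one partition.
    // This cannot be expressed as a transformation of individual elements.
    static Integer count(Iterator<Integer> values) {
        int n = 0;
        while (values.hasNext()) {
            values.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        List<Integer> withTwo = mapPartition(data, 2, MapPartitionSketch::count);
        List<Integer> withThree = mapPartition(data, 3, MapPartitionSketch::count);
        System.out.println(withTwo);   // [4, 3]
        System.out.println(withThree); // [3, 2, 2]
    }
}
```

The same input yields different per-partition counts for parallelism 2 and 3, which is why the comment should warn that the number of elements each function instance sees depends on the degree of parallelism.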


