flink-issues mailing list archives

From "Chesnay Schepler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-671) Python interface for new API (Map/Reduce)
Date Mon, 28 Jul 2014 14:41:39 GMT

    [ https://issues.apache.org/jira/browse/FLINK-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076264#comment-14076264 ]

Chesnay Schepler commented on FLINK-671:

Status Update:
Fixed a few things with [~qmlmoon]'s help.

The following tasks should be resolved next:
- memory-mapped files (IO operations currently take up 50% of the total time)
- iteration support (difficulty mostly depends on how easy it is to implement an accumulator
in Python and sync it with its Java counterpart)
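
To illustrate the memory-mapped file idea, here is a minimal Python sketch of exchanging a batch of integers through a mapped file instead of a stream. The file layout (a 4-byte big-endian count followed by 4-byte big-endian ints) and the names write_ints, read_ints and exchange.bin are assumptions made for this example only; they are not taken from the actual patch, and the Java side is not shown.

```python
import mmap
import os
import struct
import tempfile


def write_ints(path, values):
    """Write a count header plus the values into a memory-mapped file."""
    size = 4 + 4 * len(values)
    # Pre-size the file so the whole region can be mapped.
    with open(path, "wb") as f:
        f.truncate(size)
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as m:
            m[0:4] = struct.pack(">i", len(values))
            m[4:size] = struct.pack(">%di" % len(values), *values)
            m.flush()


def read_ints(path):
    """Map the file read-only and decode the header and the values."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            (count,) = struct.unpack(">i", m[0:4])
            return list(struct.unpack(">%di" % count, m[4:4 + 4 * count]))


# Hypothetical round trip within one process; in the real setup the
# writer and the reader would be the Java and Python processes.
with tempfile.TemporaryDirectory() as tmp_dir:
    exchange_path = os.path.join(tmp_dir, "exchange.bin")
    write_ints(exchange_path, [1, 2, 3])
    round_trip = read_ints(exchange_path)
```

Big-endian encoding is used here because that is what Java's DataInputStream and ByteBuffer read by default, but the real wire format may differ.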

One thing I would like, though I don't know *at all* how feasible it is or how much the payoff
would be: alternate versions of all Java operators that provide an iterator/collector.

The reasoning is that the processes currently spend quite some time waiting for the next value to arrive.

For a map operation, the Python process waits until Java has read the result, returned it,
fetched the next value, and written it to the stream.

Iterator/collector-based operators would have the following effects:
- We could load several values into the stream at once, reducing this waiting time.
- It would allow us to write values in batches, reducing the number of IO calls.
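
The effect of batching on the number of IO calls can be sketched in Python as follows. This is only an illustration under assumptions: CountingStream is a hypothetical in-memory stand-in for the stream between the two processes, the values are plain 4-byte big-endian ints, and the batch size of 100 is arbitrary.

```python
import io
import struct


class CountingStream(io.BytesIO):
    """In-memory stream that counts write() calls (stand-in for the real stream)."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


def send_one_by_one(stream, values):
    # One write call per value, as with a per-record exchange.
    for value in values:
        stream.write(struct.pack(">i", value))


def send_batched(stream, values, batch_size):
    # One write call per batch of up to batch_size values.
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        stream.write(struct.pack(">%di" % len(chunk), *chunk))


values = list(range(1000))

single = CountingStream()
send_one_by_one(single, values)

batched = CountingStream()
send_batched(batched, values, batch_size=100)
```

Both variants put the same bytes on the stream; the batched one does so in 10 write calls instead of 1000 for this input.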

> Python interface for new API (Map/Reduce)
> -----------------------------------------
>                 Key: FLINK-671
>                 URL: https://issues.apache.org/jira/browse/FLINK-671
>             Project: Flink
>          Issue Type: Improvement
>          Components: Python API
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>              Labels: github-import
>             Fix For: pre-apache
>         Attachments: pull-request-671-9139035883911146960.patch
> ([#615|https://github.com/stratosphere/stratosphere/issues/615] | [FLINK-615|https://issues.apache.org/jira/browse/FLINK-615])
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/pull/671
> Created by: [zentol|https://github.com/zentol]
> Labels: enhancement, java api, 
> Milestone: Release 0.6 (unplanned)
> Created at: Wed Apr 09 20:52:06 CEST 2014
> State: open

This message was sent by Atlassian JIRA