flink-issues mailing list archives

From "Chesnay Schepler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-671) Python interface for new API (Map/Reduce)
Date Wed, 16 Jul 2014 07:54:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063258#comment-14063258
] 

Chesnay Schepler commented on FLINK-671:
----------------------------------------

Finally found the issue: long strings (> 4000 bytes) are not read properly on the Java
side. When I filter these out, the WC runs fine.

I never checked how much data Java actually reads and only used a single call to read. Since
at most 4k bytes are present at that time (the size of the buffer behind standard pipes), it
reads only those and forgets about the rest. The next read call then reads data that wasn't
supposed to be there, generally breaking the program.
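
A minimal sketch of the difference between a single read call and a read loop, not the actual code from the pull request. The class ReadFullyExample, the helper readFully, and the ChunkedStream stand-in for a pipe are hypothetical names for illustration; the 4096-byte chunk size is an assumption based on the roughly 4k bytes mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyExample {

    /** Hypothetical stand-in for a pipe: hands out at most `chunk` bytes per read call. */
    static class ChunkedStream extends InputStream {
        private final InputStream in;
        private final int chunk;

        ChunkedStream(byte[] data, int chunk) {
            this.in = new ByteArrayInputStream(data);
            this.chunk = chunk;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Never return more than `chunk` bytes at once, like a partially filled pipe.
            return in.read(b, off, Math.min(len, chunk));
        }
    }

    /** Keeps calling read until exactly `len` bytes have arrived or the stream ends. */
    static int readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            // read may return fewer bytes than requested, so check its return value.
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                throw new EOFException("stream ended after " + total + " of " + len + " bytes");
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] record = new byte[10000]; // a "long string" larger than one chunk
        byte[] buf = new byte[record.length];

        // A single call only gets what is available right now.
        int single = new ChunkedStream(record, 4096).read(buf, 0, buf.length);

        // The loop collects the whole record.
        int looped = readFully(new ChunkedStream(record, 4096), buf, 0, buf.length);

        System.out.println("single read: " + single + " bytes, looped read: " + looped + " bytes");
    }
}
```

With these assumed sizes, the single call returns 4096 bytes and the loop returns all 10000. The standard library's DataInputStream.readFully performs the same kind of loop and may be a simpler choice where wrapping the stream is acceptable.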

> Python interface for new API (Map/Reduce)
> -----------------------------------------
>
>                 Key: FLINK-671
>                 URL: https://issues.apache.org/jira/browse/FLINK-671
>             Project: Flink
>          Issue Type: Improvement
>          Components: Python API
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>              Labels: github-import
>             Fix For: pre-apache
>
>         Attachments: pull-request-671-9139035883911146960.patch
>
>
> ([#615|https://github.com/stratosphere/stratosphere/issues/615] | [FLINK-615|https://issues.apache.org/jira/browse/FLINK-615])
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/pull/671
> Created by: [zentol|https://github.com/zentol]
> Labels: enhancement, java api
> Milestone: Release 0.6 (unplanned)
> Created at: Wed Apr 09 20:52:06 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.2#6252)
