flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1268) FileOutputFormat with overwrite does not clear local output directories
Date Fri, 21 Nov 2014 11:00:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220774#comment-14220774 ]

Stephan Ewen commented on FLINK-1268:

I agree, it would be useful to change that...

> FileOutputFormat with overwrite does not clear local output directories
> -----------------------------------------------------------------------
>                 Key: FLINK-1268
>                 URL: https://issues.apache.org/jira/browse/FLINK-1268
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
> I noticed that the FileOutputFormat does not clear the output directory when it writes
> to local disk. As a consequence, partition files from a previous run remain in the
> directory if the DOP (degree of parallelism) is decreased between subsequent runs. If
> the data is then read from this directory, more partitions are read in than were
> actually written. This can lead to incorrect behaviour in user code that is hard to
> debug. I'm aware that in a distributed execution the TaskManagers or the Tasks have to
> be responsible for the cleanup, and that the cleanup has to be coordinated if multiple
> Tasks are running on the same TaskManager.
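
The effect described in the issue can be reproduced without Flink. The following is a minimal sketch using only java.nio; it is not Flink's FileOutputFormat code. The class name, the helper methods, and the one-file-per-task naming (1, 2, ...) are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StalePartitionsDemo {

    // Hypothetical stand-in for a parallel sink: one file per task, named 1..dop.
    // Existing files with the same name are overwritten, others are left alone.
    static void writePartitions(Path dir, int dop) throws IOException {
        Files.createDirectories(dir);
        for (int i = 1; i <= dop; i++) {
            Files.write(dir.resolve(String.valueOf(i)),
                    ("partition " + i).getBytes(StandardCharsets.UTF_8));
        }
    }

    // Deletes the regular files directly inside dir (non-recursive).
    // This is the cleanup step the issue asks for, in its simplest local form.
    static void clearDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return;
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    Files.delete(f);
                }
            }
        }
    }

    // Counts the entries in dir, i.e. the partitions a reader would pick up.
    static int countPartitions(Path dir) throws IOException {
        int n = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("flink-1268-demo");

        writePartitions(dir, 4);   // first run, DOP 4
        writePartitions(dir, 2);   // second run, DOP 2, no cleanup
        System.out.println("without cleanup: " + countPartitions(dir)); // prints 4

        clearDirectory(dir);       // cleanup before the overwrite
        writePartitions(dir, 2);
        System.out.println("with cleanup: " + countPartitions(dir));    // prints 2
    }
}
```

Without the cleanup, files 3 and 4 from the first run survive and a reader sees four partitions although only two were written. The sketch covers a single process only; it does not address the coordination between multiple Tasks on one TaskManager that the description mentions.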

This message was sent by Atlassian JIRA