hadoop-common-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5889) Allow writing to output directories that exist, as long as they are empty
Date Mon, 24 Aug 2009 14:11:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746884#action_12746884 ]

Tom White commented on HADOOP-5889:
-----------------------------------

I don't see why we wouldn't make this change to both old and new APIs. 

There is a precedent for having the same test for the old and new APIs (e.g. the one for LazyOutput),
so yes, I would create a new org.apache.hadoop.mapreduce.lib.output.TestFileOutputFormat.

> Allow writing to output directories that exist, as long as they are empty
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-5889
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5889
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.18.3
>            Reporter: Ian Nowland
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-5889-0.patch
>
>
> The current behavior in FileOutputFormat.checkOutputSpecs is to fail if the path specified
> by mapred.output.dir exists at the start of the job. This is to protect from accidentally
> overwriting existing data. There seems no harm then in slightly relaxing this check to allow
> the case for the output to exist if it is an empty directory.
> At a minimum this would allow outputting to the root of S3N buckets, which is currently
> impossible (https://issues.apache.org/jira/browse/HADOOP-5805).
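
For illustration only, the relaxed check described above might look roughly like the sketch below. It is not the attached patch: it uses java.nio.file on a local path so that it runs without Hadoop on the classpath, and the class and method names (OutputDirCheck, checkOutputDir) are hypothetical. A version inside FileOutputFormat.checkOutputSpecs would presumably go through the Hadoop FileSystem API (e.g. exists and listStatus) instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the check in FileOutputFormat.checkOutputSpecs.
class OutputDirCheck {

    // Accepts a missing path or an empty directory; rejects anything else.
    static void checkOutputDir(Path outDir) throws IOException {
        if (!Files.exists(outDir)) {
            return; // nothing exists yet, so nothing can be overwritten
        }
        if (!Files.isDirectory(outDir)) {
            throw new IOException(
                "Output path " + outDir + " already exists and is not a directory");
        }
        // An existing directory is acceptable only if it has no entries.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(outDir)) {
            if (entries.iterator().hasNext()) {
                throw new IOException(
                    "Output directory " + outDir + " already exists and is not empty");
            }
        }
    }
}
```

Under these assumptions, a job pointed at a freshly created (empty) directory would pass the check, while a directory that already holds output from an earlier run would still be rejected, which preserves the protection against accidental overwrites.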

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

