hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3) Output directories are not cleaned up before the reduces run
Date Fri, 10 Feb 2006 18:04:56 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-3?page=comments#action_12365934 ] 

Doug Cutting commented on HADOOP-3:

An even safer way to fix this would be to have JobClient throw an exception if the output
directory already exists.  That way folks won't inadvertently overwrite things.
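
A minimal sketch of that fail-fast check, assuming a local filesystem and plain java.nio rather than Hadoop's own filesystem classes. The class and method names (OutputDirCheck, requireAbsent) are hypothetical and only illustrate the idea; they are not taken from JobClient or from the attached patch.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hadoop: illustrates the fail-fast idea only.
final class OutputDirCheck {
    private OutputDirCheck() {}

    /**
     * Throws if outputDir already exists, so a job is rejected at submission
     * time instead of writing into the results of an earlier run.
     */
    static void requireAbsent(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                outputDir.toString(), null, "output directory already exists");
        }
    }
}
```

Calling OutputDirCheck.requireAbsent(path) before any work is scheduled means the user has to delete or rename the old output explicitly, which is the safety property described above.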

> Output directories are not cleaned up before the reduces run
> ------------------------------------------------------------
>          Key: HADOOP-3
>          URL: http://issues.apache.org/jira/browse/HADOOP-3
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Reporter: Owen O'Malley
>     Priority: Minor
>  Attachments: clean-out-dir.patch
> The output directory for the reduces is not cleaned up, so you can see leftovers from
> previous runs if they had more reduces. For example, if you run the application once with
> reduces=10 and then rerun with reduces=8, your output directory will have frag00000 to
> frag00009, with the first 8 fragments from the second run and the last 2 fragments from
> the first run.
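
The example in the report can be worked through in a few lines. This is only a sketch: the class and method names are hypothetical, and the five-digit "frag" naming is inferred from the frag00000 and frag00009 names quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of which fragments survive a rerun untouched.
final class StaleFragments {
    private StaleFragments() {}

    /**
     * Lists the fragment files that a rerun with fewer reduces never
     * overwrites, assuming reduce i writes a file named frag%05d.
     */
    static List<String> staleAfterRerun(int previousReduces, int currentReduces) {
        List<String> stale = new ArrayList<>();
        for (int i = currentReduces; i < previousReduces; i++) {
            stale.add(String.format("frag%05d", i));
        }
        return stale;
    }
}
```

With previousReduces=10 and currentReduces=8 this yields frag00008 and frag00009, the two leftover fragments from the first run.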

This message is automatically generated by JIRA.