hadoop-mapreduce-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
Date Fri, 21 Aug 2009 15:52:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746023#action_12746023 ]

Tom White commented on MAPREDUCE-370:
-------------------------------------

This is looking good.

I think there should be a way to have complete control over the output filename,
much as MultipleOutputFormat does. To achieve this we could change the baseOutputPath parameter
in the write methods to be a full output path. The user application would be responsible for
making sure there are no name clashes, which matches the functionality available in MultipleOutputFormat
today. The overloaded version is available if the user doesn't care so much about the output
filenames, which will then have a {m,r}-nnnnn suffix. Does this make sense?
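
To make the two naming modes concrete, here is a minimal sketch in plain Java. It is
not the MultipleOutputs API: the class OutputNameSketch, its TaskType enum, and the
methods suffixedName() and fullName() are hypothetical names used only for illustration.
The only detail taken from the discussion above is the {m,r}-nnnnn suffix.

```java
import java.util.Locale;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class OutputNameSketch {
    enum TaskType { MAP, REDUCE }

    private OutputNameSketch() {}

    // Framework-chosen name: append -m-nnnnn or -r-nnnnn to the base name,
    // so files written by different tasks cannot collide.
    static String suffixedName(String baseOutputPath, TaskType type, int partition) {
        char t = (type == TaskType.MAP) ? 'm' : 'r';
        return String.format(Locale.ROOT, "%s-%c-%05d", baseOutputPath, t, partition);
    }

    // Caller-chosen name: the path is used verbatim, so the application
    // is responsible for keeping names unique across tasks.
    static String fullName(String fullOutputPath) {
        return fullOutputPath;
    }

    public static void main(String[] args) {
        System.out.println(suffixedName("errors", TaskType.REDUCE, 3)); // errors-r-00003
        System.out.println(fullName("2009/08/21/errors.txt"));
    }
}
```

With the full-path variant, two tasks that pass the same path would write to the same
file, which is why the clash check moves to the application.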

Also, we should find a way of not exposing MultipleOutputs#checkBaseOutputPath() and checkTokenName()
as public methods since they are only needed internally by the framework.
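
One way to do this, sketched below under assumptions, is to make the checks private
static helpers that a public entry point calls. The class NamedOutputSketch and the
method register() are hypothetical, and the validation rules shown (letters and digits
only, and rejecting the name "part") are assumed for the example rather than taken
from the patch.

```java
// Hypothetical illustration only; not the code in patch-370.
final class NamedOutputSketch {
    private NamedOutputSketch() {}

    // Public entry point: runs both checks, then accepts the name.
    public static String register(String namedOutput) {
        checkTokenName(namedOutput);
        checkBaseOutputPath(namedOutput);
        return namedOutput;
    }

    // Private: callers outside this class cannot invoke the check directly.
    // Assumed rule: the name must be non-empty ASCII letters and digits.
    private static void checkTokenName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be null or empty");
        }
        for (char ch : name.toCharArray()) {
            boolean ok = (ch >= 'A' && ch <= 'Z')
                    || (ch >= 'a' && ch <= 'z')
                    || (ch >= '0' && ch <= '9');
            if (!ok) {
                throw new IllegalArgumentException("Name cannot contain '" + ch + "'");
            }
        }
    }

    // Private: assumed rule that "part" is reserved for the default output.
    private static void checkBaseOutputPath(String outputPath) {
        if ("part".equals(outputPath)) {
            throw new IllegalArgumentException("Output name cannot be 'part'");
        }
    }
}
```

If other framework classes in the same package need the checks, package-private
visibility would be the alternative to private.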

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-370
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-370-1.txt, patch-370.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

