hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
Date Fri, 07 Aug 2009 11:20:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740509#action_12740509 ]

Amareshwari Sriramadasu commented on MAPREDUCE-370:
---------------------------------------------------

bq. To achieve this, I think we could port MultipleOutputs, and change the semantics of getCollector()
in the multi name case, so that the multi name is the full name of the name of the output
file. This method is typically invoked in the reduce() method, where the key and value are
available, and can be used to form the name.
Tom, are you saying that we should not have a protected generateOutputName() method that
could be overridden to provide this functionality? If so, we need a way to find out whether
a given name is a namedOutput (I mean multiNamedOutputs) or an arbitrary name, so that we
know which output format should be used for writing.
We should have something like:
{code}
  public <K,V> void write(String namedOutput, String outputPath, K key, V value)
          throws IOException, InterruptedException;
  public <K,V> void write(String outputPath, K key, V value)
          throws IOException, InterruptedException;
{code}
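
To illustrate the distinction, here is a minimal in-memory sketch of how the two overloads could choose the output format. It is not the Hadoop implementation: the class MultipleOutputsSketch, the addNamedOutput() and contents() helpers, and the use of a BiFunction as a stand-in for an output format are all hypothetical, and the checked exceptions from the signatures above are omitted because nothing here does real I/O.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical stand-in for illustration only; not the Hadoop MultipleOutputs class. */
final class MultipleOutputsSketch {
    // Assumption: an "output format" is modelled as a function that renders one record as a line.
    private final Map<String, BiFunction<Object, Object, String>> namedFormats = new HashMap<>();
    private final BiFunction<Object, Object, String> defaultFormat;
    // Assumption: output files are modelled as in-memory lists of lines keyed by path.
    private final Map<String, List<String>> files = new HashMap<>();

    MultipleOutputsSketch(BiFunction<Object, Object, String> defaultFormat) {
        this.defaultFormat = defaultFormat;
    }

    /** Registers the format to use for a named output (hypothetical helper). */
    void addNamedOutput(String namedOutput, BiFunction<Object, Object, String> format) {
        namedFormats.put(namedOutput, format);
    }

    /** Named-output case: the format registered for namedOutput is used. */
    <K, V> void write(String namedOutput, String outputPath, K key, V value) {
        BiFunction<Object, Object, String> format = namedFormats.get(namedOutput);
        if (format == null) {
            throw new IllegalArgumentException("Undefined named output: " + namedOutput);
        }
        append(outputPath, format.apply(key, value));
    }

    /** Arbitrary-name case: no named output is given, so the job's default format is used. */
    <K, V> void write(String outputPath, K key, V value) {
        append(outputPath, defaultFormat.apply(key, value));
    }

    /** Returns the lines written to outputPath so far (hypothetical helper). */
    List<String> contents(String outputPath) {
        return files.getOrDefault(outputPath, new ArrayList<>());
    }

    private void append(String outputPath, String line) {
        files.computeIfAbsent(outputPath, p -> new ArrayList<>()).add(line);
    }
}
```

Because the named output is passed explicitly in the four-argument call, the writer never has to guess whether a string is a named output or an arbitrary path; the three-argument call always falls back to the default format.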

bq. Applications that want to add a unique suffix can call FileOutputFormat#getUniqueFile()
themselves.
This should be done by the framework to support counters, as explained earlier.

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-370
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>         Attachments: patch-370.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

