hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Moved: (MAPREDUCE-1768) Streaming command should be able to produce multiple outputs stored as separate DFS data sets
Date Fri, 07 May 2010 05:15:48 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu moved HADOOP-2237 to MAPREDUCE-1768:
------------------------------------------------------------

    Project: Hadoop Map/Reduce  (was: Hadoop Common)
        Key: MAPREDUCE-1768  (was: HADOOP-2237)

> Streaming command should be able to produce multiple outputs stored as separate DFS data sets
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1768
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1768
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: arkady borkovsky
>
> Some streaming commands, in the map or reduce phase, produce several output files as a "side effect".
> The names of the output files may be hard-coded or specified on the command line.
> The streaming infrastructure should allow these files to be copied into DFS.
> For each distinct "output file name", a separate DFS dataset (DFS directory) should be created, and the file from each individual task should be stored there as a part file. The names of the directories may be derived from the main "output name" (default).
> Related to https://issues.apache.org/jira/browse/HADOOP-2236: in the case of reduce, a single named output file may be seen as a special case of this.
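
To make the proposed layout concrete, here is a minimal sketch of how destination paths could be derived. The issue only says directory names "may be derived from the main output name", so the `<main output>_<file name>` scheme, the zero-padded `part-NNNNN` file names, and the helper names `side_output_path` and `plan_copies` are all assumptions for illustration, not part of Hadoop Streaming.

```python
import posixpath


def side_output_path(main_output, side_file_name, task_index):
    """Return a hypothetical DFS destination for one task's side-effect file."""
    # Assumed scheme: the dataset directory is the main output name plus
    # the side file's base name, e.g. /user/out + errors.txt -> /user/out_errors.txt
    base = posixpath.basename(side_file_name)
    dataset_dir = "{}_{}".format(main_output.rstrip("/"), base)
    # One part file per task, zero-padded to five digits (assumed convention).
    return posixpath.join(dataset_dir, "part-{:05d}".format(task_index))


def plan_copies(main_output, side_file_names, task_index):
    """Map each distinct side file name to its hypothetical DFS destination."""
    return {
        name: side_output_path(main_output, name, task_index)
        for name in sorted(set(side_file_names))
    }
```

Under these assumptions, task 3 writing `errors.txt` with main output `/user/out` would have its file stored at `/user/out_errors.txt/part-00003`, so every task's file for the same output name lands in the same DFS directory.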

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

