[ https://issues.apache.org/jira/browse/MAPREDUCE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu updated MAPREDUCE-1768:
-----------------------------------------------
Component/s: contrib/streaming
> Streaming command should be able to produce multiple outputs stored as separate DFS data sets
> ----------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1768
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1768
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/streaming
> Reporter: arkady borkovsky
>
> Some streaming commands in the map or reduce phase produce several output files as a "side effect".
> The names of the output files may be hard-coded or specified on the command line.
> The streaming infrastructure should allow these files to be copied into DFS.
> For each distinct "output file name", a separate DFS dataset (DFS directory) should be created, and each task's file should be stored there as a part file. By default, the directory names may be derived from the main "output name".
> Related to https://issues.apache.org/jira/browse/HADOOP-2236: in the reduce case, a single named output file may be seen as a special case of this.
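
The layout requested above could look roughly like the following sketch. It is only an illustration, not part of any existing streaming API. The function names (side_output_path, plan_copies), the "<main output>-<file name>" directory scheme, and the "part-NNNNN" file naming are all assumptions made for the example; the issue itself only says that directory names may be derived from the main output name.

```python
import posixpath


def side_output_path(main_output, local_file_name, partition):
    """Return the DFS path where one task's side-output file would be stored.

    Assumed scheme (hypothetical, not specified in the issue):
      <main_output>-<file name>/part-<5-digit task partition>
    """
    # Use only the file's base name so local subdirectories are ignored.
    name = posixpath.basename(local_file_name)
    # One DFS directory (dataset) per distinct output file name.
    dataset_dir = f"{main_output.rstrip('/')}-{name}"
    # Each task contributes a single part file to that directory.
    return posixpath.join(dataset_dir, f"part-{partition:05d}")


def plan_copies(main_output, local_files, partition):
    """Map each local side-output file of a task to its DFS destination."""
    return {f: side_output_path(main_output, f, partition) for f in local_files}


if __name__ == "__main__":
    # Hypothetical task number 3 that wrote two side files locally.
    for src, dst in plan_copies("/user/out", ["errors.log", "stats.txt"], 3).items():
        print(src, "->", dst)
```

Under this assumed scheme, every task that writes errors.log contributes one part file to the same DFS directory, so that directory can be read later as a single dataset.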
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.