[ https://issues.apache.org/jira/browse/MAPREDUCE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu resolved MAPREDUCE-613.
-----------------------------------------------
Resolution: Duplicate
This can be achieved through the skipping bad records feature.
> Streaming should allow to re-start the command if it failed in the middle of input
> ----------------------------------------------------------------------------------
>
> Key: MAPREDUCE-613
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-613
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: arkady borkovsky
>
> Sometimes, we need to use imperfect programs to process data.
> Recently, I used a public domain program that did what I needed, but crashed after processing
> a few million records (in my case, more than half of the mappers would succeed, with the rest
> failing at different completion percentages).
> It would be nice to be able to tell the Streaming framework:
> if the streaming command fails at some input record (and you get "pipe broken" from it),
> restart the command and continue feeding it the data.
> Please log the failing record.
> In text mining, quite often, losing a few records of the input makes no difference at all.
> Of course, this feature should be disabled by default and should have some "are you really sure"
> provision (an expert feature).
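
The restart-and-continue behaviour requested above can be sketched outside Hadoop as a small standalone wrapper. This is only an illustration, not Hadoop Streaming's API or its skipping bad records implementation. The names feed_with_restart, _shutdown and max_restarts are hypothetical, and the sketch assumes that each record is a single line and that the command writes exactly one output line per input line, so a crash can be attributed to one record.

```python
import subprocess
import sys


def _shutdown(proc):
    """Close the pipes and reap the child, whether or not it is still alive."""
    try:
        proc.stdin.close()
    except OSError:
        pass  # the child is already gone; nothing left to flush
    proc.stdout.close()
    proc.kill()  # harmless if the child has already exited
    proc.wait()


def feed_with_restart(cmd, records, max_restarts=10):
    """Feed records to cmd one line at a time, restarting cmd when it dies.

    Assumption (for this sketch only): cmd emits exactly one output line
    per input line, so an empty read means it died on the current record.
    Returns (outputs, skipped_records).
    """
    outputs, skipped = [], []
    restarts = 0
    proc = None
    try:
        for record in records:
            if proc is None:
                # Start (or restart) the command before feeding the next record.
                proc = subprocess.Popen(
                    cmd,
                    stdin=subprocess.PIPE,
                    stdout=subprocess.PIPE,
                    text=True,
                )
            try:
                proc.stdin.write(record + "\n")
                proc.stdin.flush()
                line = proc.stdout.readline()
            except BrokenPipeError:
                line = ""  # treat a broken pipe like an empty read
            if line:
                outputs.append(line.rstrip("\n"))
                continue
            # The command died on this record: log it, skip it, restart.
            skipped.append(record)
            print("skipping failing record: %r" % record, file=sys.stderr)
            _shutdown(proc)
            proc = None
            restarts += 1
            if restarts > max_restarts:
                raise RuntimeError("command failed too many times")
    finally:
        if proc is not None:
            _shutdown(proc)
    return outputs, skipped
```

A caller would pass the command as an argument list and an iterable of records, then inspect the returned list of skipped records. The max_restarts cap plays the role of the "are you really sure" provision: the wrapper gives up rather than silently discarding an unbounded amount of input.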
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.