hadoop-common-dev mailing list archives

From "Sanjay Dahiya (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-728) Map-reduce task does not produce correct results when -reducer NONE is specified through streaming
Date Fri, 24 Nov 2006 11:17:05 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-728?page=comments#action_12452419 ] 
Sanjay Dahiya commented on HADOOP-728:

I am planning to make the following changes in streaming for this:

1. Use PhasedFileSystem for map output when -reducer NONE is specified. It is currently
package-protected; I have made it public so that streaming can use it. This lets maps generate
multiple files as side effects and avoids duplicating functionality.

2. Currently, with -reducer NONE, the output of the maps is written explicitly to a DFS file,
so all map tasks try to write to the same file in DFS (see PipeMapRed.java:264), which causes
this problem. This part is changed to treat <-output> as a directory, and the map output goes
into it. Each map output file name includes the task id to avoid conflicts, and PhasedFileSystem
now takes care of speculative maps trying to write to the same DFS file.
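
To illustrate the two changes above, here is a minimal sketch on a local filesystem using only java.nio. It is not the PhasedFileSystem implementation and makes no claim about DFS semantics. The class name, the "part-" and "_tmp-" prefixes, the attempt id parameter, and the task id strings are all hypothetical, chosen only for illustration. Each attempt writes to its own temporary file inside the output directory, and only the first attempt to finish is promoted to the per-task final file. Note that the existence check and the move are not guaranteed to be atomic here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MapOutputSketch {

    // Hypothetical naming scheme: one final file per map task under the
    // output directory, so different tasks never share a file.
    static Path finalPathFor(Path outputDir, String taskId) {
        return outputDir.resolve("part-" + taskId);
    }

    // Writes this attempt's data to a private temporary file, then tries to
    // promote it to the task's final file. Returns true if this attempt's
    // output became the final file, false if another attempt got there first.
    static boolean writeAndCommit(Path outputDir, String taskId,
                                  String attemptId, String data) throws IOException {
        Files.createDirectories(outputDir);
        Path tmp = outputDir.resolve("_tmp-" + taskId + "-" + attemptId);
        Files.write(tmp, data.getBytes(StandardCharsets.UTF_8));
        Path dest = finalPathFor(outputDir, taskId);
        try {
            // Without REPLACE_EXISTING, the move fails if dest already exists.
            Files.move(tmp, dest);
            return true;
        } catch (FileAlreadyExistsException e) {
            // A speculative duplicate lost the race; discard its output.
            Files.delete(tmp);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("streaming-output");
        // Two different map tasks write to two different files.
        System.out.println(writeAndCommit(out, "task_0001_m_000000", "0", "a\n"));
        System.out.println(writeAndCommit(out, "task_0001_m_000001", "0", "b\n"));
        // A speculative second attempt of the first task is discarded.
        System.out.println(writeAndCommit(out, "task_0001_m_000000", "1", "a\n"));
    }
}
```

Putting the task id in the file name removes the conflict between different map tasks, while the temporary file plus promotion step stands in for whatever mechanism handles speculative attempts of the same task.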


> Map-reduce task does not produce correct results when -reducer NONE is specified through streaming
> --------------------------------------------------------------------------------------------------
>                 Key: HADOOP-728
>                 URL: http://issues.apache.org/jira/browse/HADOOP-728
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: dhruba borthakur
>         Assigned To: Sanjay Dahiya
> a) a file is created for the output instead of a directory.
> b) there is no way to understand what is going on from the client output
> I can produce an example for you, if you like -- but the behavior is consistent, so $HSTREAM
> -mapper /bin/cat -reducer NONE should show the problem

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

