hadoop-common-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4749) reducer should output input data size and record count when shuffling is done
Date Fri, 05 Dec 2008 19:06:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653894#action_12653894 ]

Zheng Shao commented on HADOOP-4749:
------------------------------------

Let's keep it simple and output only the input data size for now. Sorting does take a long time
(especially with the load balancing problem), so I don't want to wait until that is done.


> reducer should output input data size and record count when shuffling is done
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-4749
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4749
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Zheng Shao
>
> Sometimes we see a single slow reducer because of the load balancing problem. This information
> will be very useful for understanding how imbalanced the load is.
> Should be easy to fix I guess, since the reducer should have all the information needed at the
> end of the shuffling phase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

