hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1059) distcp can generate uneven map task assignments
Date Wed, 07 Oct 2009 18:09:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763163#action_12763163 ]

Aaron Kimball commented on MAPREDUCE-1059:
------------------------------------------

This adds a deprecation warning for one more use of {{org.apache.hadoop.mapred.FileSplit}}.
Since these deprecation warnings are left unsuppressed in the rest of DistCp (as they
should be), this is an expected outcome. The core test failure is still unrelated to this patch.

> distcp can generate uneven map task assignments
> -----------------------------------------------
>
>                 Key: MAPREDUCE-1059
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1059
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1059.2.patch, MAPREDUCE-1059.patch
>
>
> distcp writes out a SequenceFile listing the source files to transfer and their sizes.
> Map tasks are created over spans of this file; each span represents the files that one
> mapper should transfer. In practice, some transfer loads yield many empty map tasks,
> while a few tasks perform the bulk of the work.
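
The span assignment described above can be illustrated with a minimal sketch. This is not the DistCp code: the class name SpanSketch, the method bytesPerSpan, and the greedy "close a span once it reaches total/numMaps bytes" rule are all assumptions made for illustration. It only shows how grouping consecutive listing entries by cumulative size can give very uneven per-task byte counts when one large file dominates; it does not model how spans are read back from the SequenceFile, so it does not reproduce the empty map tasks themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from DistCp.
class SpanSketch {

    /**
     * Walks the file sizes in listing order and closes a span whenever the
     * accumulated bytes reach the per-map target (total / numMaps).
     * Returns the number of bytes assigned to each span.
     */
    static List<Long> bytesPerSpan(long[] fileSizes, int numMaps) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        // Assumed target: an even share of the total, at least 1 byte.
        long target = Math.max(1, total / numMaps);

        List<Long> spans = new ArrayList<>();
        long accumulated = 0;
        for (long size : fileSizes) {
            accumulated += size;
            if (accumulated >= target) {
                spans.add(accumulated);
                accumulated = 0;
            }
        }
        // Whatever is left over forms a final, possibly tiny, span.
        if (accumulated > 0) {
            spans.add(accumulated);
        }
        return spans;
    }

    public static void main(String[] args) {
        // One large file followed by four tiny ones, with four maps requested.
        long[] sizes = {1000, 1, 1, 1, 1};
        System.out.println(bytesPerSpan(sizes, 4));
    }
}
```

With these inputs only two spans are produced even though four maps were requested, and the first span carries almost all of the bytes.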

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

