hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2560) Processing multiple input splits per mapper task
Date Thu, 14 Aug 2008 20:49:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622681#action_12622681

Runping Qi commented on HADOOP-2560:

I like the algorithm Doug outlined above.
A few thoughts for refinement:

Processing N input splits per mapper task may result in map output spills, depending on the
value of N, the split sizes, the value of io.sort.mb, and the nature of the map function.
Thus, N should be  configured by the user on a per job basis. The default should be 1.
N should be chosen in such a way that the mapper tasks processing N splits will not result
in spills most time. 

The actual number of splits a particular mapper task will take should vary, depending on the
number of splits to be processed.
When the number of splits to be processed is low, the number of splits to be processed by
the next mapper task should be reduced, so that other tasks may 
process the splits in parallel.

All the splits processed by the same mapper task should share the same rackid so that rack
locality should be maintained.


> Processing multiple input splits per mapper task
> ------------------------------------------------
>                 Key: HADOOP-2560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2560
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
> Currently, an input split contains a consecutive chunk of input file, which by default,
corresponding to a DFS block.
> This may lead to a large number of mapper tasks if the input data is large. This leads
to the following problems:
> 1. Shuffling cost: since the framework has to move M * R map output segments to the nodes
running reducers, 
> larger M means larger shuffling cost.
> 2. High JVM initialization overhead
> 3. Disk fragmentation: larger number of map output files means lower read throughput
for accessing them.
> Ideally, you want to keep the number of mappers to no more than 16 times the number of
 nodes in the cluster.
> To achive that, we can increase the input split size. However, if a split span over more
than one dfs block,
> you lose the data locality scheduling benefits.
> One way to address this problem is to combine multiple input blocks with the same rack
into one split.
> If in average we combine B blocks into one split, then we will reduce the number of mappers
by a factor of B.
> Since all the blocks for one mapper share a rack, thus we can benefit from rack-aware
> Thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message