hadoop-mapreduce-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5196) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
Date Wed, 04 Jun 2014 15:33:02 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017771#comment-14017771 ]

Carlo Curino commented on MAPREDUCE-5196:

Answering Remus:

(I am not 100% sure, since I wrote this code over a year ago, but let me try to recall.)
As part of the preemption work we explored HDFS-based shuffling.
The benefits were:
1) performance improvements for certain data-size ranges (stream-merge on the reducers)
2) a much smaller reducer checkpoint state (no data, just the offset of the last key read from each map)
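
To illustrate point 2, here is a minimal sketch of what an offsets-only reducer checkpoint could look like. It is an assumption for illustration, not code from the MAPREDUCE-5196 patch or from Hadoop: the class name ReducerCheckpointSketch, its methods, and the text serialization format are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch, not a Hadoop class: the checkpoint stores no shuffle
 * data, only how far the reducer has read into each map's output.
 */
public class ReducerCheckpointSketch {
    // map task id -> offset of the last key read from that map's output
    private final Map<String, Long> lastReadOffset = new TreeMap<>();

    /** Record progress; offsets only move forward. */
    public void recordRead(String mapTaskId, long offset) {
        lastReadOffset.merge(mapTaskId, offset, Math::max);
    }

    /** Offset to resume from after a restart; 0 if the map was never read. */
    public long resumeOffset(String mapTaskId) {
        return lastReadOffset.getOrDefault(mapTaskId, 0L);
    }

    /** Serialize as one "mapTaskId=offset" line per map (assumed format). */
    public String serialize() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : lastReadOffset.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Rebuild a checkpoint from the text produced by serialize(). */
    public static ReducerCheckpointSketch parse(String text) {
        ReducerCheckpointSketch cp = new ReducerCheckpointSketch();
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            int eq = line.lastIndexOf('=');
            cp.recordRead(line.substring(0, eq),
                          Long.parseLong(line.substring(eq + 1)));
        }
        return cp;
    }
}
```

The size of such a checkpoint grows with the number of maps rather than with the amount of shuffled data, which is only possible when the map outputs themselves remain readable after the reducer is preempted (here, because they were on HDFS).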

That was an initial experiment, but making it robust was non-trivial (missing map outputs
were hard to recover), so we have not pushed it yet. In that context, the mapOutput was on
HDFS rather than on the local FS, and the change you pointed out was fixing that. But this
clearly does not work on Windows. My guess is that reverting that part should be fine here.

> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing 
> ------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5196
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>             Fix For: 3.0.0
>         Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles propagation of the preemption requests received from the RM to the appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the task state is handled in upcoming JIRAs.

This message was sent by Atlassian JIRA
