hadoop-mapreduce-issues mailing list archives

From "Hitesh Shah (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled
Date Fri, 23 Sep 2016 00:17:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514897#comment-15514897 ]

Hitesh Shah edited comment on MAPREDUCE-6638 at 9/23/16 12:17 AM:
------------------------------------------------------------------

bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill w.r.t. recovery is not the same as a committer not supporting recovery. Is there
any reason we cannot just re-run the job from scratch if not all reducers have completed (or
re-run all maps and incomplete reducers)?

Ideally, you could just re-run most of the job's tasks if needed to support proper
fault tolerance, even in scenarios where the key cannot be stored securely. In this scenario,
the new AM can generate a new key. I agree that this might not be a performant solution,
but it at least spares the user from having to re-submit the job. If performance
is an issue, users can turn off recovery when encryption is enabled for scenarios where the
key cannot be stored securely.
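
To make the proposal concrete, here is a minimal, self-contained sketch of the per-task decision a restarted AM could make. It is not taken from any patch on this issue and does not use the real MRAppMaster classes: the class, enum, and method names (EncryptedSpillRecoverySketch, Task, plan) are hypothetical, and the rule that completed reducer output stays usable after the spill key is lost is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: none of these names exist in Hadoop.
public class EncryptedSpillRecoverySketch {

    enum TaskType { MAP, REDUCE }
    enum Action { REUSE_OUTPUT, RERUN }

    // Minimal stand-in for what a previous AM attempt recorded about a task.
    static final class Task {
        final String id;
        final TaskType type;
        final boolean completed;

        Task(String id, TaskType type, boolean completed) {
            this.id = id;
            this.type = type;
            this.completed = completed;
        }
    }

    // Returns one action per task, in the same order as the input list.
    static List<Action> plan(List<Task> tasks, boolean encryptedSpill, boolean oldKeyAvailable) {
        // The old spill key is gone: the new AM generates a new key, so
        // intermediate data encrypted with the old one is unreadable.
        boolean keyLost = encryptedSpill && !oldKeyAvailable;

        boolean allReducersDone = true;
        for (Task t : tasks) {
            if (t.type == TaskType.REDUCE && !t.completed) {
                allReducersDone = false;
            }
        }

        List<Action> actions = new ArrayList<>();
        for (Task t : tasks) {
            boolean reuse;
            if (!t.completed) {
                reuse = false;                       // unfinished work always re-runs
            } else if (t.type == TaskType.MAP) {
                // Completed map output is needed only by unfinished reducers,
                // and it cannot be decrypted once the key is lost.
                reuse = !keyLost || allReducersDone;
            } else {
                reuse = true;                        // assumption: reducer output is still usable
            }
            actions.add(reuse ? Action.REUSE_OUTPUT : Action.RERUN);
        }
        return actions;
    }

    public static void main(String[] args) {
        List<Task> tasks = Arrays.asList(
                new Task("m1", TaskType.MAP, true),
                new Task("m2", TaskType.MAP, false),
                new Task("r1", TaskType.REDUCE, true),
                new Task("r2", TaskType.REDUCE, false));
        List<Action> actions = plan(tasks, true, false);
        for (int i = 0; i < tasks.size(); i++) {
            System.out.println(tasks.get(i).id + " -> " + actions.get(i));
        }
    }
}
```

With the key lost and r2 unfinished, the sketch re-runs both maps and r2 and keeps only r1, which is the "re-run all maps and incomplete reducers" option. Re-running from scratch would simply return RERUN for every task.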


was (Author: hitesh):
bq. (1) Avoid recovering an AM if encrypted spill is enabled

Encrypted spill w.r.t. recovery is not the same as a committer not supporting recovery. Is there
any reason we cannot just re-run the job from scratch if not all reducers have completed?

Ideally, you could just re-run most of the job's tasks if needed to support proper
fault tolerance, even in scenarios where the key cannot be stored securely. In this scenario,
the new AM can generate a new key. I agree that this might not be a performant solution,
but it at least spares the user from having to re-submit the job. If performance
is an issue, users can turn off recovery when encryption is enabled for scenarios where the
key cannot be stored securely.

> Do not attempt to recover jobs if encrypted spill is enabled
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-6638
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 2.7.2
>            Reporter: Karthik Kambatla
>            Assignee: Haibo Chen
>         Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch, mapreduce6638.003.patch,
>                      mapreduce6638.004.patch, mapreduce6683.005.patch
>
>
> After the fix for CVE-2015-1776, jobs with encrypted spills enabled cannot be recovered
> if the AM fails. We should store the key some place safe so they can actually be recovered.
> If there is no "safe" place, we should at least restart the job by re-running all mappers/reducers.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

