Date: Fri, 4 Jan 2013 15:30:12 +0000 (UTC)
From: "Robert Joseph Evans (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Created] (MAPREDUCE-4912) Investigate ways to clean up double job commit prevention

Robert Joseph Evans created MAPREDUCE-4912:
----------------------------------------------

             Summary: Investigate ways to clean up double job commit prevention
                 Key: MAPREDUCE-4912
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4912
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
            Reporter: Robert Joseph Evans

Once MAPREDUCE-4819 goes in, it fixes the issue where an OutputCommitter can double commit a job, so that the output is never touched after the job has externally reported success or failure. The code and design could still use some cleanup and refactoring.

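For readers unfamiliar with the problem, here is a minimal sketch of what "double commit prevention" means, assuming the commit status is recorded as marker files in a directory. It is not the MAPREDUCE-4819 code: the class CommitGuard, the Outcome values, the Committer interface, and the marker file names are all hypothetical, and the local filesystem (java.nio.file) stands in for HDFS.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; names do not come from the Hadoop code base. */
class CommitGuard {
    enum Outcome { COMMITTED, ALREADY_SUCCEEDED, ALREADY_FAILED, PREVIOUS_ATTEMPT_UNFINISHED }

    /** Stand-in for the job-level commit step of an output committer. */
    interface Committer {
        void commitJob() throws IOException;
    }

    // Assumed marker names; the real file names may differ.
    private final Path started;
    private final Path success;
    private final Path failed;

    CommitGuard(Path statusDir) {
        this.started = statusDir.resolve("COMMIT_STARTED");
        this.success = statusDir.resolve("COMMIT_SUCCESS");
        this.failed = statusDir.resolve("COMMIT_FAIL");
    }

    /** Runs the commit at most once across attempts that share statusDir. */
    Outcome commitOnce(Committer committer) throws IOException {
        // A finished earlier attempt decides the result; the output is not touched again.
        if (Files.exists(success)) {
            return Outcome.ALREADY_SUCCEEDED;
        }
        if (Files.exists(failed)) {
            return Outcome.ALREADY_FAILED;
        }
        try {
            // createFile fails if the marker already exists, so only one attempt gets past here.
            Files.createFile(started);
        } catch (FileAlreadyExistsException e) {
            // An earlier attempt began committing and never recorded a result.
            return Outcome.PREVIOUS_ATTEMPT_UNFINISHED;
        }
        try {
            committer.commitJob();
        } catch (IOException e) {
            Files.createFile(failed); // record the failure before reporting it
            throw e;
        }
        Files.createFile(success);
        return Outcome.COMMITTED;
    }
}
```

A restarted attempt would construct a new CommitGuard over the same directory; if an earlier attempt already recorded success, commitOnce returns ALREADY_SUCCEEDED without invoking the committer again.
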
Issues brought up that should be investigated include:

# Reporting KILL, rather than an error, for killed jobs that crash after the kill happens.
# Using the job history log to record the commit status instead of separate external files in HDFS.
# Placing the recovery/retry logic in the commit handler instead of the MRAppMaster, and having the recovery service replay the logs as it normally does for recovery.

These are not things that must be done, but alternatives that might clean up the code.