Date: Fri, 22 Aug 2014 23:50:12 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-4815) FileOutputCommitter.commitJob can be very slow for jobs with many output files

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107718#comment-14107718 ]

Hadoop QA commented on MAPREDUCE-4815:
--------------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12663778/MAPREDUCE-4815.v7.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4817//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4817//console

This message is automatically generated.

> FileOutputCommitter.commitJob can be very slow for jobs with many output files
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.3, 2.0.1-alpha, 2.4.1
>            Reporter: Jason Lowe
>            Assignee: Siqi Li
>         Attachments: MAPREDUCE-4815.v3.patch, MAPREDUCE-4815.v4.patch, MAPREDUCE-4815.v5.patch, MAPREDUCE-4815.v6.patch, MAPREDUCE-4815.v7.patch
>
>
> If a job generates many files to commit then the commitJob method call at the end of the job can take minutes. This is a performance regression from 1.x, as 1.x had the tasks commit directly to the final output directory as they were completing and commitJob had very little to do. The commit work was processed in parallel and overlapped the processing of outstanding tasks. In 0.23/2.x, the commit is single-threaded and waits until all tasks have completed before commencing.
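
For readers who want to see the two commit strategies from the issue description side by side, here is a minimal sketch. It is not Hadoop's FileOutputCommitter code and not the attached patch: the class `CommitSketch` and all of its methods are hypothetical, a local temporary directory stands in for the job's file system, and a thread pool stands in for tasks finishing concurrently. `commitAtJobEnd` moves every task's files in one thread after all tasks are done (the 0.23/2.x behaviour as described), while `commitPerTask` lets each task move its own files (the 1.x behaviour as described).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class CommitSketch {

    /** Lists the entries of a directory and closes the directory handle afterwards. */
    static List<Path> list(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Moves every file in one task's staging directory into the output directory. */
    static int commitTask(Path taskDir, Path outputDir) {
        int moved = 0;
        for (Path file : list(taskDir)) {
            try {
                Files.move(file, outputDir.resolve(file.getFileName()));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            moved++;
        }
        return moved;
    }

    /** Single-threaded commit that starts only once every task has finished. */
    static int commitAtJobEnd(List<Path> taskDirs, Path outputDir) {
        int moved = 0;
        for (Path taskDir : taskDirs) {
            moved += commitTask(taskDir, outputDir);
        }
        return moved;
    }

    /** Each task commits its own output, so the moves run in parallel. */
    static int commitPerTask(List<Path> taskDirs, Path outputDir, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Path taskDir : taskDirs) {
                Callable<Integer> commit = () -> commitTask(taskDir, outputDir);
                results.add(pool.submit(commit));
            }
            int moved = 0;
            for (Future<Integer> result : results) {
                moved += result.get();
            }
            return moved;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Creates staging directories filled with empty, uniquely named files (made-up data). */
    static List<Path> stage(Path root, int tasks, int filesPerTask) throws IOException {
        List<Path> taskDirs = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            Path taskDir = Files.createDirectories(root.resolve("task_" + t));
            for (int f = 0; f < filesPerTask; f++) {
                Files.createFile(taskDir.resolve("part-" + t + "-" + f));
            }
            taskDirs.add(taskDir);
        }
        return taskDirs;
    }

    /** Stages files, commits them with the chosen strategy, and counts the files in the output directory. */
    public static int run(int tasks, int filesPerTask, boolean perTask) {
        try {
            Path root = Files.createTempDirectory("commit-sketch");
            Path out = Files.createDirectory(root.resolve("output"));
            List<Path> taskDirs = stage(root.resolve("staging"), tasks, filesPerTask);
            if (perTask) {
                commitPerTask(taskDirs, out, 4);
            } else {
                commitAtJobEnd(taskDirs, out);
            }
            return list(out).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("job-end commit moved " + run(8, 25, false) + " files");
        System.out.println("per-task commit moved " + run(8, 25, true) + " files");
    }
}
```

Both strategies leave the same files in the output directory; the difference is only in when the moves happen and how many threads perform them. On a local disk the timing gap is likely to be small, so the sketch shows the structure of the problem rather than reproducing the minutes-long delay reported in the issue.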