Date: Mon, 4 Nov 2013 23:00:19 +0000 (UTC)
From: "Daryn Sharp (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-5186) mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813388#comment-13813388 ]

Daryn Sharp commented on MAPREDUCE-5186:
----------------------------------------

+1 After a walkthrough of the code, it looks good to me, or at least on par with 1.x.

> mapreduce.job.max.split.locations causes some splits created by CombineFileInputFormat to fail
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5186
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>    Affects Versions: 2.0.4-alpha, 2.2.0
>            Reporter: Sangjin Lee
>            Assignee: Robert Parker
>            Priority: Critical
>         Attachments: MAPREDUCE-5186v1.patch, MAPREDUCE-5186v2.patch, MAPREDUCE-5186v3.patch, MAPREDUCE-5186v3.patch
>
>
> CombineFileInputFormat can easily create splits that can come from many different locations (during the last pass of creating "global" splits). However, we observe that this often runs afoul of the mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and with any decent size cluster, CombineFileInputFormat creates splits that are well above this limit.
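
To illustrate the failure mode described in the issue, here is a minimal, self-contained sketch of a location-count check. It is not the actual JobSplitWriter code: the class `SplitLocationCheck` and its methods are hypothetical names, and the only value taken from the issue is the default limit of 10. The host names are made up for the example.

```java
import java.io.IOException;

public class SplitLocationCheck {
    // Default for mapreduce.job.max.split.locations as quoted in the issue.
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    // True when a split reports more locations than the configured limit.
    static boolean exceedsLimit(String[] locations, int maxLocations) {
        return locations.length > maxLocations;
    }

    // Hypothetical stand-in for the check the issue attributes to
    // JobSplitWriter: a split over the limit is rejected outright.
    static void checkLocations(String[] locations, int maxLocations) throws IOException {
        if (exceedsLimit(locations, maxLocations)) {
            throw new IOException("split has " + locations.length
                    + " locations, limit is " + maxLocations);
        }
    }

    // Builds the location list of an imaginary combined split whose
    // blocks are spread over the given number of hosts.
    static String[] combinedSplitLocations(int hosts) {
        String[] locations = new String[hosts];
        for (int i = 0; i < hosts; i++) {
            locations[i] = "host" + i;
        }
        return locations;
    }

    public static void main(String[] args) {
        // A "global" combined split can easily span more hosts than the limit.
        for (int hosts : new int[] {3, 10, 25}) {
            try {
                checkLocations(combinedSplitLocations(hosts), DEFAULT_MAX_SPLIT_LOCATIONS);
                System.out.println(hosts + " locations: accepted");
            } catch (IOException e) {
                System.out.println(hosts + " locations: rejected (" + e.getMessage() + ")");
            }
        }
    }
}
```

In this sketch a split spanning 25 hosts is rejected under the default limit, which mirrors the symptom reported for CombineFileInputFormat on larger clusters. Assuming the check behaves as the issue describes, raising mapreduce.job.max.split.locations for the job would presumably avoid the rejection.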