Delivered-To: mailing list user@hadoop.apache.org
From: Harsh J
Date: Thu, 19 Sep 2013 14:03:46 +0530
Subject: Re: Issue: Max block location exceeded for split error when running hive
To: user@hadoop.apache.org

Are you using a CombineFileInputFormat or similar input format then, perhaps?

On Thu, Sep 19, 2013 at 1:29 PM, Murtaza Doctor wrote:
> We are using the default replication factor of 3. When new files are put on
> HDFS we never override the replication factor. When there is more data
> involved it fails on a larger split size.
>
> On Wed, Sep 18, 2013 at 6:34 PM, Harsh J wrote:
>>
>> Do your input files carry a replication factor of 10+? That could be
>> one cause behind this.
>>
>> On Thu, Sep 19, 2013 at 6:20 AM, Murtaza Doctor wrote:
>> > Folks,
>> >
>> > Anyone run into this issue before:
>> > java.io.IOException: Max block location exceeded for split: Paths:
>> > "/foo/bar...."
>> > ....
>> > InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
>> > splitsize: 15 maxsize: 10
>> >     at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
>> >     at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
>> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:501)
>> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
>> >     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
>> >     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
>> >     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
>> >     at java.security.AccessController.doPrivileged(Native Method)
>> >     at javax.security.auth.Subject.doAs(Subject.java:415)
>> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
>> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
>> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
>> >     at java.security.AccessController.doPrivileged(Native Method)
>> >     at javax.security.auth.Subject.doAs(Subject.java:415)
>> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
>> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
>> >     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
>> >
>> > When we set the property higher than the value it failed on, as
>> > suggested:
>> >     mapreduce.job.max.split.locations = (greater than the reported splitsize)
>> > the job runs successfully.
>> >
>> > I am trying to dig up additional documentation on this, since the
>> > default seems to be 10; I am not sure how that limit was chosen.
>> > Additionally, what is the recommended value, and what factors does it
>> > depend on?
>> >
>> > We are running YARN; the actual query is Hive on CDH 4.3, with Hive
>> > version 0.10.
>> >
>> > Any pointers in this direction will be helpful.
>> >
>> > Regards,
>> > md
>>
>> --
>> Harsh J

--
Harsh J
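[For readers who land on this thread later: the guard that throws here is a simple count check at job-submission time. JobSplitWriter compares each split's block-location count against the configured maximum (mapreduce.job.max.split.locations, default 10, per the thread). The Python below is purely an illustrative sketch of that logic, not Hadoop's actual API; the function name and node names are made up.]

```python
DEFAULT_MAX_SPLIT_LOCATIONS = 10  # the default limit discussed in the thread


def check_split_locations(locations, max_locations=DEFAULT_MAX_SPLIT_LOCATIONS):
    """Illustrative stand-in for the guard in JobSplitWriter: reject any
    split that reports more block locations than the configured maximum."""
    if len(locations) > max_locations:
        raise IOError(
            "Max block location exceeded for split: "
            "splitsize: %d maxsize: %d" % (len(locations), max_locations))
    return locations


# With the default replication factor of 3, a plain TextInputFormat split
# has roughly 3 locations and passes; a combine-style split that pulls
# blocks from 15 datanodes (the "splitsize: 15" in the trace above) would
# raise exactly the "splitsize: 15 maxsize: 10" error shown in the thread.
check_split_locations(["dn1", "dn2", "dn3"])
```

[This also shows why the workaround in the thread works: either the location count per split comes down (lower replication, or fewer blocks combined per split), or the limit goes up, e.g. a session-level `SET mapreduce.job.max.split.locations=30;` in Hive, where 30 is an illustrative value that merely needs to exceed the reported splitsize.]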