hadoop-mapreduce-user mailing list archives

From Murtaza Doctor <murtazadoc...@gmail.com>
Subject Re: Issue: Max block location exceeded for split error when running hive
Date Thu, 19 Sep 2013 07:59:39 GMT
We are using the default replication factor of 3. When new files are put
on HDFS we never override the replication factor. When more data is
involved, the job fails with a larger reported split size.
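
For completeness, here is a minimal sketch of how the replication factor on
the input can be double-checked with the standard FileSystem API (the path
below is only a placeholder, like the "/foo/bar" in the trace):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReplication {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Print the per-file replication factor for every file under the input dir.
    for (FileStatus status : fs.listStatus(new Path("/foo/bar"))) {
      System.out.println(status.getPath() + " replication=" + status.getReplication());
    }
  }
}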


On Wed, Sep 18, 2013 at 6:34 PM, Harsh J <harsh@cloudera.com> wrote:

> Do your input files carry a replication factor of 10+? That could be
> one cause behind this.
>
> On Thu, Sep 19, 2013 at 6:20 AM, Murtaza Doctor <murtazadoctor@gmail.com>
> wrote:
> > Folks,
> >
> > Has anyone run into this issue before:
> > java.io.IOException: Max block location exceeded for split: Paths:
> > "/foo/bar...."
> > ....
> > InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
> > splitsize: 15 maxsize: 10
> > at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
> > at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
> > at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:501)
> > at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
> > at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
> > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
> > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:415)
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
> > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
> > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:415)
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
> > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
> > at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
> >
> > When we set the property to something higher than the value it failed on,
> > as suggested:
> > mapreduce.job.max.split.locations = <value above the failing splitsize>
> > then the job runs successfully.
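
(Side note, in case it helps others: the property can also be raised per job.
Below is a minimal sketch for a plain MapReduce driver; in a Hive session the
same property should be settable with SET before running the query. The value
20 is only illustrative, chosen to be above the splitsize of 15 reported in
the trace.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RaiseSplitLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Illustrative value: anything above the reported splitsize (15 here).
    conf.setInt("mapreduce.job.max.split.locations", 20);
    Job job = Job.getInstance(conf, "raise-split-locations-example");
    // ... configure input/output formats and paths as usual before submitting.
  }
}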
> >
> > I am trying to dig up additional documentation on this, since the default
> > seems to be 10; I am not sure how that limit was chosen.
> > Additionally, what is the recommended value, and what factors does it
> > depend on?
> >
> > We are running YARN; the actual query is Hive on CDH 4.3, with Hive
> > version 0.10.
> >
> > Any pointers in this direction will be helpful.
> >
> > Regards,
> > md
>
>
>
> --
> Harsh J
>
