hadoop-mapreduce-user mailing list archives

From Murtaza Doctor <murtazadoc...@gmail.com>
Subject Issue: Max block location exceeded for split error when running hive
Date Thu, 19 Sep 2013 00:50:41 GMT
Folks,

Has anyone run into this issue before?
java.io.IOException: Max block location exceeded for split: Paths: "/foo/bar...."
....
InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
splitsize: 15 maxsize: 10
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
    at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:501)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)

When we raise the property, as suggested, to a value above the split size
the job failed on:
mapreduce.job.max.split.locations = <more than the split size it failed on>
then the job runs successfully.
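
To make the relationship between "splitsize" and "maxsize" concrete, here is a minimal sketch of the kind of check the error message implies. It is not the Hadoop source: the class SplitLocationCheck and the helpers exceedsLimit and describe are hypothetical names, and the sketch assumes the job is rejected only when the number of split locations is strictly greater than the configured limit.

```java
// Illustrative sketch only; this is not the Hadoop source code.
final class SplitLocationCheck {

    // Limit taken from "maxsize: 10" in the error above.
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    // Hypothetical helper: true when a split reports more locations
    // than the limit allows (assumes a strict "greater than" comparison).
    static boolean exceedsLimit(int locationCount, int maxLocations) {
        return locationCount > maxLocations;
    }

    // Hypothetical helper: formats the two values the way the tail of
    // the reported error does.
    static String describe(int locationCount, int maxLocations) {
        return "splitsize: " + locationCount + " maxsize: " + maxLocations;
    }

    public static void main(String[] args) {
        int splitSize = 15; // number of locations reported in the trace above

        // Try the limit from the error, then two higher candidate values.
        for (int limit : new int[] {DEFAULT_MAX_SPLIT_LOCATIONS, 15, 20}) {
            String outcome = exceedsLimit(splitSize, limit) ? "rejected" : "accepted";
            System.out.println(describe(splitSize, limit) + " -> " + outcome);
        }
    }
}
```

Under that assumption, any value of mapreduce.job.max.split.locations at or above the reported splitsize would let the split through, which matches the behaviour described above.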

I am trying to dig up additional documentation on this. The default seems
to be 10, but I am not sure how that limit was chosen.
Additionally, what is the recommended value, and what factors does it
depend on?

We are running YARN; the actual query is a Hive query on CDH 4.3, with Hive
version 0.10.

Any pointers in this direction will be helpful.

Regards,
md
