hadoop-general mailing list archives

From "Rottinghuis, Joep" <jrottingh...@ebay.com>
Subject RE: java.io.IOException: Split metadata size exceeded 10000000
Date Thu, 17 Mar 2011 19:01:34 GMT
Doubt this is a CDH3 issue.
We saw the same with a large job using the 0.20-security branch.

There is a property (mapreduce.jobtracker.split.metainfo.maxsize) that can be used to override
the default of 10000000 (10^7) bytes.
We found that passing this along with the job has no effect; it worked only when we set
this property on the jobtracker node. Not sure if this is a feature or a bug.
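
For illustration, a minimal sketch of what the jobtracker-side setting might look like in
mapred-site.xml on the jobtracker node. The property name is the one mentioned above; the
value shown is only an example (ten times the default), not a recommendation, and it is an
assumption that the jobtracker has to be restarted before the new limit takes effect.

```xml
<!-- mapred-site.xml on the jobtracker node (file location depends on the install) -->
<configuration>
  <property>
    <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
    <!-- Example value only: 100000000 bytes instead of the default 10000000 -->
    <value>100000000</value>
  </property>
</configuration>
```

Setting the same property in the job's own configuration did not help in our case, as noted
above, so the sketch only covers the jobtracker side.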

Cheers,

Joep

-----Original Message-----
From: Harsh J [mailto:qwertymaniac@gmail.com] 
Sent: Tuesday, March 15, 2011 3:33 AM
To: CDH Users
Cc: wlangiewicz@gmail.com
Subject: Re: java.io.IOException: Split metadata size exceeded 10000000

Moving this discussion to the CDH users list at cdh-user [at]
cloudera.org since it could be a CDH specific issue.

[Bcc: general]

On Tue, Mar 15, 2011 at 3:25 PM, Wojciech Langiewicz
<wlangiewicz@gmail.com> wrote:
> Hello,
> I'm having this problem running mapreduce jobs over about 10TB of data
> (smaller jobs are ok):
> 2011-03-15 07:48:22,031 ERROR org.apache.hadoop.mapred.JobTracker: Job
> initialization failed:
> java.io.IOException: Split metadata size exceeded 10000000. Aborting job
> job_201103141436_0058
>        at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:48)
>        at org.apache.hadoop.mapred.JobInProgress.createSplits(JobInProgress.java:732)
>        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:633)
>        at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3965)
>        at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:619)
>
> 2011-03-15 07:48:22,031 INFO org.apache.hadoop.mapred.JobTracker: Failing
> job job_201103141436_0058
>
> What settings should I change to run this job?
> I'm using CDH3b3.
> Thanks for all answers.
>
> --
> Wojciech Langiewicz
>



-- 
Harsh J
http://harshj.com
