hadoop-common-user mailing list archives

From Philip <mr.ph...@gmail.com>
Subject OOME only with large datasets
Date Wed, 17 Dec 2008 18:44:51 GMT
I've been trying to troubleshoot an OOME we've been having.

When we run the job over a dataset that is about 700GB (~9000 files) or larger,
we get an OOME on the map jobs.  However, if we run the job over a smaller
subset of the data, everything works fine.  So my question is: What
changes in Hadoop as the size of the input set increases?

We are on Hadoop 0.18.0.

Here is a stack trace produced by the job tracker:
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
        at java.lang.StringBuffer.append(StringBuffer.java:224)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(DeferredDocumentImpl.java:1167)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(DeferredDocumentImpl.java:1120)
        at com.sun.org.apache.xerces.internal.dom.DeferredTextImpl.synchronizeData(DeferredTextImpl.java:93)
        at com.sun.org.apache.xerces.internal.dom.CharacterDataImpl.getData(CharacterDataImpl.java:160)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:928)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:851)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:819)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:278)
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:446)
        at org.apache.hadoop.mapred.JobConf.getKeepFailedTaskFiles(JobConf.java:308)
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.setJobConf(TaskTracker.java:1506)
        at org.apache.hadoop.mapred.TaskTracker.launchTaskForJob(TaskTracker.java:727)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:721)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1306)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:946)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1343)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2354)
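
The trace shows the heap running out while Configuration.loadResource reads the text of a single XML node. One way to check whether some configuration value grows with the input set is to measure the value lengths in the job's configuration file. The following is only a rough sketch, assuming the file uses the usual <configuration>/<property>/<name>/<value> layout; the class name ConfValueSizes and the method largestProperty are made up for illustration and are not part of Hadoop.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a Hadoop class): reports the property whose
// <value> text is longest in a Hadoop-style configuration XML file.
public class ConfValueSizes {

    // Returns "name=length" for the longest value, or null if the
    // document contains no complete <property> entries.
    static String largestProperty(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        String bestName = null;
        int bestLen = -1;
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue; // skip malformed entries
            }
            String name = names.item(0).getTextContent().trim();
            int len = values.item(0).getTextContent().length();
            if (len > bestLen) {
                bestLen = len;
                bestName = name;
            }
        }
        return bestName == null ? null : bestName + "=" + bestLen;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ConfValueSizes <path-to-config-xml>");
            return;
        }
        // The path is supplied by the caller; no Hadoop-specific location is assumed.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(largestProperty(in));
        }
    }
}
```

Running this against the configuration of a small run and a large run would show whether any one value is much bigger in the failing case. Note that it parses the whole file with DOM, so it needs enough heap to hold the document itself.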


Thanks,
Philip.
