hadoop-hdfs-user mailing list archives

From: Shubh hadoopExp <shubhhadoop...@gmail.com>
Subject: Fwd: Regarding WholeInputFileFormat Java Heap Size error
Date: Thu, 12 May 2016 05:15:46 GMT

> 
> 
> Hi All,
> 
> While recursively reading input from a directory of files around 30 MB each, using
> WholeFileInputFormat and WholeFileRecordReader, I am running into a Java heap space error
> even for a file as small as 30 MB. By default mapred.child.java.opts is set to -Xmx200m,
> which should be sufficient to process at least the 30 MB files in the directory.
> 
> The input files contain ordinary random words. Each map task is given a single 30 MB file,
> with the value being the contents of the whole file, and I am running a normal word count.
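> 
> For reference, here is a minimal sketch of the kind of WholeFileRecordReader I am describing
> (it follows the common whole-file-input pattern; the field and variable names are illustrative
> rather than my exact code):
> 
>     import java.io.IOException;
> 
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataInputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>     import org.apache.hadoop.io.BytesWritable;
>     import org.apache.hadoop.io.IOUtils;
>     import org.apache.hadoop.io.NullWritable;
>     import org.apache.hadoop.mapreduce.InputSplit;
>     import org.apache.hadoop.mapreduce.RecordReader;
>     import org.apache.hadoop.mapreduce.TaskAttemptContext;
>     import org.apache.hadoop.mapreduce.lib.input.FileSplit;
> 
>     // Reads one entire file per split and returns it as a single key/value pair,
>     // so each map task sees the whole file contents as its value.
>     public class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {
> 
>         private FileSplit fileSplit;
>         private Configuration conf;
>         private final BytesWritable value = new BytesWritable();
>         private boolean processed = false;
> 
>         @Override
>         public void initialize(InputSplit split, TaskAttemptContext context) {
>             this.fileSplit = (FileSplit) split;
>             this.conf = context.getConfiguration();
>         }
> 
>         @Override
>         public boolean nextKeyValue() throws IOException {
>             if (processed) {
>                 return false;
>             }
>             // Buffer the whole file in memory; for a 30 MB file this byte array
>             // alone takes 30 MB of the task's heap.
>             byte[] contents = new byte[(int) fileSplit.getLength()];
>             Path file = fileSplit.getPath();
>             FileSystem fs = file.getFileSystem(conf);
>             FSDataInputStream in = null;
>             try {
>                 in = fs.open(file);
>                 IOUtils.readFully(in, contents, 0, contents.length);
>                 value.set(contents, 0, contents.length);
>             } finally {
>                 IOUtils.closeStream(in);
>             }
>             processed = true;
>             return true;
>         }
> 
>         @Override
>         public NullWritable getCurrentKey() {
>             return NullWritable.get();
>         }
> 
>         @Override
>         public BytesWritable getCurrentValue() {
>             return value;
>         }
> 
>         @Override
>         public float getProgress() {
>             return processed ? 1.0f : 0.0f;
>         }
> 
>         @Override
>         public void close() {
>             // nothing to close; the stream is closed in nextKeyValue()
>         }
>     }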
> 
> If I increase mapred.child.java.opts to a higher value, the application runs successfully.
> But it would be great if anyone could explain why the default of 200 MB per task is not
> sufficient for a 30 MB file; it seems Hadoop MapReduce itself consumes a lot of the heap,
> leaving less than 30 MB of the 200 MB for processing the task. Also, is there any other way
> to read a large whole file as the input to a single map, so that every map task gets an
> entire file to process?
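> 
> For completeness, the workaround I am using for now is just to raise the per-task heap in
> the job driver. A rough sketch (assuming Hadoop 2.x; -Xmx512m is only the value I happened
> to try, and the mapper/reducer setup is omitted):
> 
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.Path;
>     import org.apache.hadoop.mapreduce.Job;
>     import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>     import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> 
>     public class WholeFileWordCountDriver {
>         public static void main(String[] args) throws Exception {
>             Configuration conf = new Configuration();
>             // Raise the per-task child JVM heap; the default -Xmx200m was not enough here.
>             conf.set("mapred.child.java.opts", "-Xmx512m");
> 
>             Job job = Job.getInstance(conf, "whole-file wordcount");
>             job.setJarByClass(WholeFileWordCountDriver.class);
>             job.setInputFormatClass(WholeFileInputFormat.class); // the custom format mentioned above
>             FileInputFormat.setInputDirRecursive(job, true);     // read input directories recursively
>             FileInputFormat.addInputPath(job, new Path(args[0]));
>             FileOutputFormat.setOutputPath(job, new Path(args[1]));
>             System.exit(job.waitForCompletion(true) ? 0 : 1);
>         }
>     }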
> 
> -Shubh 

