hadoop-mapreduce-user mailing list archives

From Mohamed Riadh Trad <Mohamed.T...@inria.fr>
Subject Small Files as input, Heap Size and garbage Collector
Date Tue, 23 Mar 2010 16:00:09 GMT
Hi,

I am running Hadoop over a collection of several million small files using CombineFileInputFormat.

However, when generating splits, the job fails with a "GC overhead limit exceeded" error.

I disabled the garbage collector overhead limit check with -server -XX:-UseGCOverheadLimit,
but I then get a java.lang.OutOfMemoryError: Java heap space, even with -Xmx8192m -server.
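
For reference, here is a minimal sketch of how one might confirm that the flags above are really in effect in the JVM that generates the splits. It is not part of my job code: the class name HeapCheck and its helper methods are hypothetical, and it only uses standard JDK calls.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports the heap ceiling and the startup flags
// of the JVM it runs in.
public class HeapCheck {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if the given flag appears verbatim in the argument list.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM at startup (e.g. -Xmx8192m).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("GC overhead limit disabled: "
                + hasFlag(jvmArgs, "-XX:-UseGCOverheadLimit"));
    }
}
```

If the reported maximum heap is far below 8192 MiB, the options are probably not reaching the process that computes the splits.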

Is there any solution to avoid this limit when splitting input?

Regards



