hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: How to increase data processed by one JVM instance
Date Tue, 09 Nov 2010 09:47:37 GMT

On Tue, Nov 9, 2010 at 2:55 PM, Jyothish Soman <jyothish.soman@gmail.com> wrote:
> Hello,
> I wanted to know how to increase the data processed by a single JVM
> instance. What options are needed for this, and where to put them up.

What exactly do you mean by increasing the "data processed" part?

In case you're running into out-of-memory issues, look at the
"mapred.child.java.opts" property, which takes ordinary JVM options
such as -Xmx, to increase the heap size allocated to each task JVM
under a TaskTracker.
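
A rough way to confirm that a heap setting such as -Xmx1024m (the value
is only an illustration) actually reached a task JVM is to log the
maximum heap from inside the task. The sketch below uses only the
standard Runtime API; the class and method names are hypothetical and
not part of Hadoop.

```java
public class HeapCheck {
    // Returns the maximum heap the current JVM may use, in whole megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. "java -Xmx1024m HeapCheck" to see the flag take effect.
        System.out.println("max heap (MB): " + maxHeapMb());
    }
}
```

Calling something like maxHeapMb() once at the start of a map or reduce
task and writing the result to the task log should show whether the
option was picked up.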

If you're looking to increase the size of the input split each mapper
acts upon (which defaults to the block size, if I am right), the
property "mapred.min.split.size", set in bytes, can help you with that
(although certain InputFormats may override this). You can also copy
the data around on HDFS with a new block size set.
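
To illustrate why raising the minimum split size gives each mapper more
data, here is a small stand-alone sketch of the rule the old-API
FileInputFormat applies, as far as I understand it: the split size is
max(minSize, min(goalSize, blockSize)). The class name and the sizes
are made up for the example, and nothing here calls Hadoop itself.

```java
public class SplitSizeSketch {
    // Split size rule: max(minSize, min(goalSize, blockSize)).
    // goalSize is roughly total input size / requested number of maps.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;   // assumed HDFS block size
        long goalSize = 1024 * mb;  // assumed goal size for the job

        // Small minimum: the block size wins, so splits are 64 MB.
        System.out.println(computeSplitSize(goalSize, 1, blockSize) / mb);

        // Minimum above the block size: splits grow to 256 MB.
        System.out.println(computeSplitSize(goalSize, 256 * mb, blockSize) / mb);
    }
}
```

With a minimum below the block size the setting has no effect, which is
why the value has to exceed the block size before each mapper sees more
data.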


Harsh J
