hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: sub 60 second performance
Date Sun, 10 May 2009 23:41:39 GMT
You can cache the block inside your task by holding it in a static
variable, which stays pinned in memory for as long as the JVM is reused.
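
A rough sketch of what I mean, in plain Java with no Hadoop dependencies so
it runs on its own. The names here (PinnedBlockCache, get, loadCount,
fakeRead, the example path) are made up for illustration and are not part of
any Hadoop API. In a real job you would call something like get() from your
mapper's setup code and have the loader do the actual HDFS read. The static
map only survives while the task JVM stays alive, so this relies on JVM
reuse being enabled (the setNumTasksToExecutePerJvm setting you mention),
and as far as I know it will not help once that JVM has exited.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class PinnedBlockCache {
    // Static field: it lives as long as this class stays loaded in the JVM,
    // so every task that runs in the same reused JVM sees the same map.
    private static final ConcurrentHashMap<String, byte[]> CACHE =
            new ConcurrentHashMap<>();

    // Counts how many times the loader was actually invoked (for illustration).
    private static final AtomicInteger LOADS = new AtomicInteger();

    // Returns the cached bytes for a path, invoking the loader only on a miss.
    // The loader is a stand-in (assumption) for whatever reads the data from HDFS.
    public static byte[] get(String path, Function<String, byte[]> loader) {
        return CACHE.computeIfAbsent(path, p -> {
            LOADS.incrementAndGet();
            return loader.apply(p);
        });
    }

    public static int loadCount() {
        return LOADS.get();
    }

    public static void main(String[] args) {
        // Hypothetical loader that fakes a read instead of touching HDFS.
        Function<String, byte[]> fakeRead =
                p -> ("contents of " + p).getBytes(StandardCharsets.UTF_8);

        // Two "tasks" in the same JVM asking for the same data.
        byte[] first = get("/data/part-00000", fakeRead);
        byte[] second = get("/data/part-00000", fakeRead);

        System.out.println("same array: " + (first == second));
        System.out.println("loads: " + loadCount());
    }
}
```

The second call returns the array the first call stored, so the expensive
read happens once per JVM rather than once per task. Keep an eye on the task
heap size, since whatever you pin this way is never released.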

On Sun, May 10, 2009 at 2:30 PM, Matt Bowyer <mattbowyers@googlemail.com> wrote:

> Hi,
> I am trying to do 'on demand map reduce' - something which will return in
> reasonable time (a few seconds).
> My dataset is relatively small and can fit into my datanode's memory. Is it
> possible to keep a block in the datanode's memory so on the next job the
> response will be much quicker? The majority of the time spent during the
> job run appears to be during the 'HDFS_BYTES_READ' part of the job. I have
> tried using the setNumTasksToExecutePerJvm but the block still seems to be
> cleared from memory after the job.
> thanks!

Alpha Chapters of my book on Hadoop are available
www.prohadoopbook.com a community for Hadoop Professionals
