hadoop-common-user mailing list archives

From: Matt Bowyer <mattbowy...@googlemail.com>
Subject: sub 60 second performance
Date: Sun, 10 May 2009 21:30:10 GMT

I am trying to do 'on-demand map reduce' - something that will return in a
reasonable time (a few seconds).

My dataset is relatively small and can fit into my datanode's memory. Is it
possible to keep a block in the datanode's memory so that the response on the
next job is much quicker? The majority of the time spent during the job run
appears to be in the 'HDFS_BYTES_READ' phase. I have tried using
setNumTasksToExecutePerJvm, but the block still seems to be cleared from
memory after the job.
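
For completeness, this is roughly how I am enabling JVM reuse (a minimal
sketch; the driver class, job name, and input/output paths are placeholders,
not my real job):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class OnDemandJob {                         // placeholder driver class
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(OnDemandJob.class);
    conf.setJobName("on-demand-query");            // placeholder job name

    // -1 means no limit: reuse a child JVM for all of a job's tasks
    // (equivalent to setting mapred.job.reuse.jvm.num.tasks=-1)
    conf.setNumTasksToExecutePerJvm(-1);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}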

