hadoop-common-user mailing list archives

From Mark question <markq2...@gmail.com>
Subject Reading from File
Date Tue, 26 Apr 2011 18:49:57 GMT

   My mapper opens a file and reads records using next(). However, I want to
stop reading if there is no memory available. What confuses me here is that
even though I'm reading record by record with next(), Hadoop actually reads
the data in dfs.block.size chunks. So, I have two questions:

1. Is it true that even if I set dfs.block.size to 512 MB, at least one
block is loaded into memory for the mapper to process (as part of the InputSplit)?

2. How can I read multiple records from a SequenceFile at once, and will it
make a difference?
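For context, a minimal sketch of the record-by-record reading pattern described above, using the SequenceFile.Reader API as it existed around Hadoop 0.20/0.21 (the file path and the LongWritable/Text key-value types are assumptions for illustration, not taken from the original message):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical path; in the poster's case this is opened inside a mapper.
        Path path = new Path(args[0]);
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            LongWritable key = new LongWritable(); // assumed key type
            Text value = new Text();               // assumed value type
            // next() deserializes one record at a time into the reusable
            // key/value objects; the underlying HDFS stream still fetches
            // data in larger buffered chunks regardless of record size.
            while (reader.next(key, value)) {
                // process one record here
            }
        } finally {
            reader.close();
        }
    }
}
```

Note that next() reuses the passed-in Writable objects rather than allocating per record, so per-record memory pressure comes mostly from what the mapper retains, not from the read loop itself.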
