hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: why does not hdfs read ahead ?
Date Tue, 24 Nov 2009 23:01:14 GMT
Hey Raghu,

There are a few performance issues.  Last week during Supercomputing '09, Caltech was having
issues with getting more than 2.6 Gbps per HDFS client process (I think they were pulling
16 files per process, but Mike knows the details).  I think they'd appreciate any advice you
have about tuning HDFS performance.

We're starting early R&D for 100Gbps dataflows, and I believe improving our current HDFS
performance is on the TODO list.

Brian

(PS - I'm not saying HDFS is at fault here - it always remains a possibility that we're using
it in a sub-optimal manner.  If you have any favorite Java performance instrumentation to
recommend, we'd also be interested in that.)
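
Before reaching for a profiler, a per-process figure such as the 2.6 Gbps quoted above can be cross-checked with plain JDK timing. The following is a minimal sketch, not a tuning recommendation: the class name ReadThroughput and the methods drain and gbps are illustrative assumptions, and the in-memory stream is only a stand-in for whatever InputStream the HDFS client returns.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadThroughput {
    /** Reads the stream to end-of-file and returns the number of bytes read. */
    static long drain(InputStream in, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            total += n;
        }
        return total;
    }

    /** Converts a byte count and elapsed nanoseconds to gigabits per second. */
    static double gbps(long bytes, long elapsedNanos) {
        // One bit per nanosecond equals 10^9 bits per second, i.e. 1 Gbps.
        return (bytes * 8.0) / elapsedNanos;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a 64 MB in-memory stream stands in for an HDFS stream.
        InputStream in = new ByteArrayInputStream(new byte[64 * 1024 * 1024]);
        long start = System.nanoTime();
        long bytes = drain(in, 1 << 20);
        long elapsed = System.nanoTime() - start;
        System.out.printf("read %d bytes at %.2f Gbps%n", bytes, gbps(bytes, elapsed));
    }
}
```

Timing the same loop with different buffer sizes, and with several streams per process, separates time spent in the client from time spent waiting on the network.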

On Nov 24, 2009, at 12:35 PM, Raghu Angadi wrote:

> Sequential read is the simplest case and it is pretty hard to improve upon
> the current raw performance (the HDFS client does take more CPU than one might
> expect; Todd implemented an improvement for the CPU consumed).
> 
> Just to reiterate what Todd said, there is an implicit read ahead for
> sequential reads with TCP buffers and kernel read ahead on Datanodes.
> 
> If you extend the read ahead buffer to be more of a buffer cache for the
> block, it could have a big impact for some read access patterns (e.g. binary
> search).
> 
> Raghu.
> 
> On Mon, Nov 23, 2009 at 11:23 PM, Martin Mituzas <xietao1981@hotmail.com> wrote:
> 
>> 
>> I read the code and found that the call
>> DFSInputStream.read(buf, off, len)
>> causes the DataNode to read len bytes (or fewer if it encounters the end of
>> the block). Why does HDFS not read ahead to improve performance for
>> sequential reads?
>> --
>> View this message in context:
>> http://old.nabble.com/why-does-not-hdfs-read-ahead---tp26491449p26491449.html
>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>> 
>> 
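
Raghu's suggestion of turning the read-ahead buffer into more of a buffer cache can be illustrated on the application side. This is a rough sketch under stated assumptions, not how the HDFS client is implemented: ChunkCache, Source, readAt and inMemory are hypothetical names, and Source merely stands in for a positional read on a remote file. Each miss fetches a whole aligned chunk, so the closely spaced probes at the end of a binary search are served from memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ChunkCache {
    /** Hypothetical stand-in for a positional read on a remote file. */
    interface Source {
        int readAt(long position, byte[] dst, int off, int len);
        long length();
    }

    private final Source source;
    private final int chunkSize;
    private final Map<Long, byte[]> chunks;
    long fetches = 0; // number of chunk loads, exposed for inspection

    ChunkCache(Source source, int chunkSize, final int maxChunks) {
        this.source = source;
        this.chunkSize = chunkSize;
        // An access-ordered LinkedHashMap gives simple LRU eviction.
        this.chunks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxChunks;
            }
        };
    }

    /** Returns the byte at position, loading its whole chunk on a miss. */
    byte readByte(long position) {
        if (position < 0 || position >= source.length()) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        long index = position / chunkSize;
        byte[] chunk = chunks.get(index);
        if (chunk == null) {
            long start = index * chunkSize;
            int want = (int) Math.min(chunkSize, source.length() - start);
            chunk = new byte[want];
            int got = 0;
            while (got < want) { // a positional read may return fewer bytes
                int n = source.readAt(start + got, chunk, got, want - got);
                if (n <= 0) throw new IllegalStateException("short read");
                got += n;
            }
            chunks.put(index, chunk);
            fetches++;
        }
        return chunk[(int) (position - index * chunkSize)];
    }

    /** In-memory Source used only for demonstration. */
    static Source inMemory(final byte[] data) {
        return new Source() {
            public int readAt(long position, byte[] dst, int off, int len) {
                int n = (int) Math.min(len, data.length - position);
                System.arraycopy(data, (int) position, dst, off, n);
                return n;
            }
            public long length() { return data.length; }
        };
    }

    public static void main(String[] args) {
        byte[] sorted = new byte[1000];
        for (int i = 0; i < sorted.length; i++) sorted[i] = (byte) (i / 8);
        ChunkCache cache = new ChunkCache(inMemory(sorted), 100, 4);
        long lo = 0, hi = sorted.length; // find the first index holding 100
        while (lo < hi) {
            long mid = (lo + hi) >>> 1;
            if (cache.readByte(mid) < 100) lo = mid + 1; else hi = mid;
        }
        System.out.println("first index = " + lo + ", chunk fetches = " + cache.fetches);
    }
}
```

The chunk size trades wasted transfer on isolated probes against round trips saved on clustered ones, so it would need to be measured against the real access pattern.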

