hadoop-hdfs-user mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: HDFS short-circuit reads
Date Tue, 17 Dec 2013 01:07:25 GMT
Hello John,

Short-circuit reads are not on by default.  The documentation page you
linked to at hadoop.apache.org contains all of the information you need to
enable them, though.

Regarding checking status of short-circuit read programmatically, here are
a few thoughts on this:

Your application could check Configuration for the
dfs.client.read.shortcircuit key.  This will tell you at a high level if
the feature is enabled.  However, note that the feature needs to be turned
on in configuration for both the DataNode and the HDFS client process.
Depending on the details of the deployment, the DataNode and the client
might be using different configuration files.
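
As a rough illustration of that check, here is a minimal sketch that looks up
the key in hdfs-site.xml-style XML using only the JDK's XML parser, so it runs
without Hadoop on the classpath.  The class and method names are made up for
this example, and the sample XML is illustrative.  In a real client you would
normally call Configuration.getBoolean("dfs.client.read.shortcircuit", false)
instead; parsing a file directly is mainly useful for inspecting a
configuration file other than the one your own process loaded.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop.
class ShortCircuitConfigCheck {

    static final String KEY = "dfs.client.read.shortcircuit";

    /** Returns the value of a property in hdfs-site.xml-style XML, or null if absent. */
    public static String lookup(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            String found = null;
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    // In this sketch the last matching entry wins.
                    found = values.item(0).getTextContent().trim();
                }
            }
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    /** True only if the key is present and set to "true"; a missing key counts as disabled. */
    public static boolean isShortCircuitEnabled(String xml) {
        return Boolean.parseBoolean(lookup(xml, KEY));
    }

    public static void main(String[] args) {
        // Illustrative content only; a real hdfs-site.xml has many more properties.
        String site =
              "<configuration>"
            + "  <property>"
            + "    <name>dfs.client.read.shortcircuit</name>"
            + "    <value>true</value>"
            + "  </property>"
            + "</configuration>";
        System.out.println("short-circuit enabled: " + isShortCircuitEnabled(site));
    }
}
```

Running the same check against both the client's and the DataNode's
configuration files would show whether the two sides agree.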

This tells you if the feature is enabled, but it doesn't necessarily tell
you if you're really going to get short-circuit reads when you open the
file.  There might not be a local replica for the block, in which case the
read would fall back to the typical remote read behavior anyway.

Depending on what your application wants to achieve, you might also be
interested in looking at the FileSystem.listLocatedStatus API to query
information about blocks and the corresponding locations of replicas.
Applications like MapReduce use this information to try to schedule their
work for optimal locality.  Short-circuit reads then become a further
optimization on top of the gains already achieved by locality.
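
To make the locality idea concrete, here is a small sketch of the check an
application might do once it has the replica host names for each block (for
example, collected from BlockLocation.getHosts() on the results of
listLocatedStatus).  The class name, method name, and host names are all
hypothetical, and the comparison assumes the local host name is in the same
form as the names reported for the replicas.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
class LocalReplicaCount {

    /** Counts the blocks that have at least one replica on localHost. */
    public static int countLocalBlocks(List<List<String>> hostsPerBlock, String localHost) {
        int local = 0;
        for (List<String> hosts : hostsPerBlock) {
            for (String host : hosts) {
                if (host.equalsIgnoreCase(localHost)) {
                    local++;
                    break; // one local replica is enough for this block
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // Made-up replica placement for a three-block file.
        List<List<String>> hostsPerBlock = Arrays.asList(
                Arrays.asList("node1", "node2", "node3"),
                Arrays.asList("node2", "node4", "node5"),
                Arrays.asList("node1", "node4", "node6"));
        int local = countLocalBlocks(hostsPerBlock, "node1");
        // prints "2 of 3 blocks have a local replica"
        System.out.println(local + " of " + hostsPerBlock.size()
                + " blocks have a local replica");
    }
}
```

Only the blocks counted here are candidates for short-circuit reads; the rest
would be read remotely regardless of the configuration setting.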

Hope this helps,

Chris Nauroth

On Mon, Dec 16, 2013 at 4:21 PM, John Lilley <john.lilley@redpoint.net> wrote:

> Our YARN application would benefit from maximal bandwidth on HDFS reads.
> But I’m unclear on how short-circuit reads are enabled.
> Are they on by default?
> Can our application check programmatically to see if the short-circuit
> read is enabled?
> Thanks,
> john
> RE:
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html
> https://issues.apache.org/jira/browse/HDFS-347
