hadoop-hdfs-user mailing list archives

From sandeep das <yarnhad...@gmail.com>
Subject Re: Yarn application reading from Data node using short-circuit.
Date Fri, 20 Nov 2015 08:52:26 GMT
Thanks Chris. I went through the description at the link and found that
I had not added the YARN user to the list of allowed users to read from unix
I've added it now and am re-running the load to see if there is any


On Thu, Nov 19, 2015 at 10:52 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:

> Hello Sandeep,
> As long as you have enabled short-circuit read as per the documentation
> [1], I expect any Hadoop process will take advantage of it while reading a
> local replica.  However, short-circuit read will not completely eliminate
> TCP connection activity to the DataNode.  There will still be a TCP
> connection from the client to the DataNode to perform a handshake and
> establish the Unix domain socket.  This is a very small payload though
> compared to the transfer of block data over the Unix domain socket.
> [1]
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html
> --Chris Nauroth
> From: sandeep das <yarnhadoop@gmail.com>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Wednesday, November 18, 2015 at 10:44 PM
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: Yarn application reading from Data node using short-circuit.
> Hi,
> I was doing some benchmarking and noticed that a lot of TCP connections
> are initiated while running my Pig jobs over YARN (MR2). These TCP
> connections are to the DataNode. Although short-circuit read is enabled
> on my DataNodes, a lot of TCP connections are still being created.
> I wanted to check how we can enable the YARN ApplicationMaster to read
> data from the DataNode using short-circuit reads, i.e. Unix domain
> sockets. I believe that will improve the performance of our jobs.
> Can someone please help me understand how I can make sure that MR2 jobs
> created by Pig scripts read data from the DataNode using short-circuit
> reads instead of TCP connections?
> Regards,
> Sandeep
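
As a rough aid to the advice above, here is a minimal sketch that checks whether an hdfs-site.xml carries the two settings the short-circuit documentation [1] describes: dfs.client.read.shortcircuit and dfs.domain.socket.path. The helper names, the sample XML, and the socket path value are assumptions made for illustration; they are not taken from this thread. The allowed-users list mentioned in the reply probably refers to the legacy setting dfs.block.local-path-access.user, which this sketch does not check.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample config; the socket path is an example value only.
SAMPLE = """<configuration>
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/lib/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def short_circuit_problems(props):
    """Return a list of human-readable problems; an empty list means both settings are present."""
    problems = []
    if props.get("dfs.client.read.shortcircuit", "false").lower() != "true":
        problems.append("dfs.client.read.shortcircuit is not true")
    if not props.get("dfs.domain.socket.path"):
        problems.append("dfs.domain.socket.path is not set")
    return problems


if __name__ == "__main__":
    for problem in short_circuit_problems(load_properties(SAMPLE)):
        print(problem)
```

Because the client also reads these settings, the same check would need to pass for the configuration seen by the YARN containers, not only for the DataNode. Even when it passes, the handshake TCP connection Chris describes will still appear.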
