Date: Thu, 17 Jul 2014 18:55:06 +0000 (UTC)
From: "Allen Wittenauer (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Resolved] (HDFS-281) Explore usage of the sendfile api via java.nio.channels.FileChannel.transfer{To|From} for i/o in datanodes

     [ https://issues.apache.org/jira/browse/HDFS-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-281.
-----------------------------------
    Resolution: Won't Fix

I'm going to close this as a duplicate of HDFS-2246. While the path chosen was different, the end result was essentially the same.

> Explore usage of the sendfile api via java.nio.channels.FileChannel.transfer{To|From} for i/o in datanodes
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-281
>                 URL: https://issues.apache.org/jira/browse/HDFS-281
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>
> We could potentially gain a lot of performance by using the *sendfile* system call:
> $ man sendfile
> {noformat}
> DESCRIPTION
>        This call copies data between one file descriptor and another. Either or both of these file descriptors may refer to a socket (but see below).
>        in_fd should be a file descriptor opened for reading and out_fd should be a descriptor opened for writing. offset is a pointer to a variable
>        holding the input file pointer position from which sendfile() will start reading data. When sendfile() returns, this variable will be set to the
>        offset of the byte following the last byte that was read. count is the number of bytes to copy between file descriptors.
>        Because this copying is done within the kernel, sendfile() does not need to spend time transferring data to and from user space.
> {noformat}
> The nio package offers this via the java.nio.channels.FileChannel.transfer{To|From} apis:
> http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)
> http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)
> From the javadocs:
> {noformat}
>  This method is potentially much more efficient than a simple loop that reads from this channel and writes to the target channel. Many operating systems can transfer bytes directly from the filesystem cache to the target channel without actually copying them.
> {noformat}
> ----
> Hence, this could be well worth exploring for doing io at the datanodes...
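
For readers of the archive, here is a minimal sketch of the transferTo pattern the issue describes. It is not taken from the HDFS datanode code or from the HDFS-2246 change; the class name TransferToSketch and the method sendFile are hypothetical names chosen for illustration. The loop is there because transferTo may move fewer bytes than requested in a single call. Whether the JVM actually uses sendfile underneath depends on the platform and on the type of the target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class TransferToSketch {

    // Sends the whole file to the target channel and returns the number of bytes sent.
    static long sendFile(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = source.size();
            long position = 0;
            while (position < size) {
                // transferTo takes an explicit position and does not move the
                // channel's own position; it may transfer fewer bytes than asked.
                long sent = source.transferTo(position, size - position, target);
                if (sent <= 0) {
                    // Assumption for this sketch: stop if no progress is made,
                    // e.g. the file was truncated or the target is non-blocking
                    // and currently full. Real code would handle these cases.
                    break;
                }
                position += sent;
            }
            return position;
        }
    }
}
```

In a datanode-like setting the target would typically be a SocketChannel connected to the reader, so that the file bytes need not be copied through a user-space buffer on platforms that support it.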