Date: Mon, 27 Oct 2014 02:36:33 +0000 (UTC)
From: "Xiaoyu Yao (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7291) Persist in-memory replicas using unbuffered IO should only applies to supported Linux version

[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184768#comment-14184768 ]

Xiaoyu Yao commented on HDFS-7291:
----------------------------------

FileChannel#transferTo only avoids a *Java NIO buffer* between the reader and the writer, as described in the JDK documentation quoted below. What we want to avoid is pulling in-memory replicas into the *OS buffer cache* during lazy persist, because these blocks already occupy memory on the RAM_DISK.

"This method is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them." FileChannel#transferTo has no control over the OS buffer cache behavior. Actually, the one we use before and now as a fallback is apache.common.io.FileUtils#copyFile(), which used a similar Java NIO API FileChannel#transferFrom. Based on our observations, it churns the OS buffer a lot during the lazy persist. Only native API sendfile() in Linux 2.6.33+ and CopyFileEx in Windows can direct OS buffer cache behavior as we desired. > Persist in-memory replicas using unbuffered IO should only applies to supported Linux version > --------------------------------------------------------------------------------------------- > > Key: HDFS-7291 > URL: https://issues.apache.org/jira/browse/HDFS-7291 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode > Affects Versions: 2.6.0 > Reporter: Xiaoyu Yao > Assignee: Xiaoyu Yao > Attachments: HDFS-7291.0.patch > > > HDFS-7090 changes to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux distribution, it relies on the sendfile() API between two file descriptors to achieve unbuffered IO copy. According to Linux document at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. This JIRA is to limit the usage of sendfile() for lazy persist only on Linux distribution with kernel version higher than 2.6.33. For unsupported version, lazy persist will fallback to normal buffered IO. -- This message was sent by Atlassian JIRA (v6.3.4#6332)