hadoop-hdfs-issues mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-7291) Persist in-memory replicas using unbuffered IO should only applies to supported Linux version
Date Mon, 27 Oct 2014 06:17:33 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184879#comment-14184879
] 

Haohui Mai edited comment on HDFS-7291 at 10/27/14 6:17 AM:
------------------------------------------------------------

bq. FileChannel#transferTo has no control over the OS buffer cache behavior. Actually, the
one we use before and now as a fallback is apache.common.io.FileUtils#copyFile(), which used
a similar Java NIO API FileChannel#transferFrom. Based on our observations, it churns the
OS buffer a lot during the lazy persist. Only native API sendfile() in Linux 2.6.33+ and CopyFileEx
in Windows can direct OS buffer cache behavior as we desired.

You're right that {{FileUtils#copyFile}} performed a buffered copy, but there is a misconception
about {{FileChannel}}: {{FileChannel#transferTo()}} simply calls {{sendfile()}} / {{CopyFileEx}}
if the kernel supports it -- see http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/nio/ch/FileChannelImpl.java#FileChannelImpl.transferTo%28long%2Clong%2Cjava.nio.channels.WritableByteChannel%29
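
For illustration, a minimal sketch of a file-to-file copy through {{FileChannel#transferTo()}}. The class and method names ({{TransferToCopy}}, {{copy}}) are hypothetical and not taken from the patch, and whether the JVM maps the call to a native zero-copy primitive depends on the platform and JDK version.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of HDFS: copies src to dst with FileChannel#transferTo.
class TransferToCopy {
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long pos = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (pos < size) {
                long n = in.transferTo(pos, size - pos, out);
                if (n <= 0) {
                    break; // no progress (e.g. source truncated); stop rather than spin
                }
                pos += n;
            }
            return pos; // number of bytes actually transferred
        }
    }
}
```

The loop matters because {{transferTo()}} is allowed to transfer fewer bytes than requested in a single call.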





was (Author: wheat9):
bq. FileChannel#transferTo has no control over the OS buffer cache behavior. Actually, the
one we use before and now as a fallback is apache.common.io.FileUtils#copyFile(), which used
a similar Java NIO API FileChannel#transferFrom. Based on our observations, it churns the
OS buffer a lot during the lazy persist. Only native API sendfile() in Linux 2.6.33+ and CopyFileEx
in Windows can direct OS buffer cache behavior as we desired.

This is misconception. {{FileChannel#transferTo()}} simply calls {{sendfile()}} / {{CopyFileEx}}
if the kernel supports it -- see http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/nio/ch/FileChannelImpl.java#FileChannelImpl.transferTo%28long%2Clong%2Cjava.nio.channels.WritableByteChannel%29

> Persist in-memory replicas using unbuffered IO should only applies to supported Linux
version
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7291
>                 URL: https://issues.apache.org/jira/browse/HDFS-7291
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch
>
>
> HDFS-7090 changed the persisting of in-memory replicas to use unbuffered IO on Linux and Windows.
On Linux, it relies on the sendfile() API between two file descriptors to achieve
an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html,
this is only supported on Linux kernel 2.6.33+.  This JIRA limits the use of sendfile()
for lazy persist to Linux distributions with kernel version 2.6.33 or later. On unsupported
versions, lazy persist will fall back to normal buffered IO.
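
The version gate described above could be sketched roughly as follows. This is an assumption-laden illustration, not the code in the attached patches: the class and method names ({{KernelVersionCheck}}, {{isAtLeast}}, {{sendfileToFileSupported}}) are hypothetical, and it assumes the JVM's {{os.version}} system property carries the kernel release string (for example {{2.6.32-431.el6.x86_64}}) on Linux.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from HDFS-7291: gates sendfile() use on kernel version.
class KernelVersionCheck {
    // Leading "major.minor[.patch]" of a release string such as "3.10.0-123.el7.x86_64".
    private static final Pattern VERSION =
            Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    static boolean isAtLeast(String release, int major, int minor, int patch) {
        Matcher m = VERSION.matcher(release);
        if (!m.find()) {
            return false; // unparseable release string: treat as unsupported
        }
        int[] have = {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            m.group(3) == null ? 0 : Integer.parseInt(m.group(3))
        };
        int[] want = {major, minor, patch};
        // Compare component by component; the first difference decides.
        for (int i = 0; i < 3; i++) {
            if (have[i] != want[i]) {
                return have[i] > want[i];
            }
        }
        return true; // exactly the required version
    }

    // True only on Linux with kernel 2.6.33 or later; callers would otherwise
    // fall back to normal buffered IO.
    static boolean sendfileToFileSupported() {
        return "Linux".equals(System.getProperty("os.name"))
                && isAtLeast(System.getProperty("os.version"), 2, 6, 33);
    }
}
```

Treating an unparseable version string as unsupported keeps the fallback path (buffered IO) as the safe default.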



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
