hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7291) Persist in-memory replicas using unbuffered IO should only applies to supported Linux version
Date Mon, 27 Oct 2014 04:48:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184830#comment-14184830
] 

Chris Nauroth commented on HDFS-7291:
-------------------------------------

Thanks for working on this, Xiaoyu.  I agree with Allen's suggestion to use the {{uname}}
function defined in sys/utsname.h, since we only ever need this information after we already
know native code is loaded.  Do you think that's a good idea?

Here are some comments on the current patch.  A lot of it likely won't be relevant if you
switch to a direct {{uname}} call, but maybe it's helpful in general.

There is an established convention (mostly) of all caps for naming static variables, so instead
of {{isLinuxSendfileAvailable}} it would be {{IS_LINUX_SENDFILE_AVAILABLE}}.

{code}
      if (version == null) {
        throw new IllegalArgumentException();
      }
{code}

{code}
      if (parts.length != 3) {
        throw new IllegalArgumentException(version);
      }
{code}

It might be helpful to provide exception messages with each of these.  Using {{Preconditions#checkArgument}}
would reduce each to a single line of code.

{code}
  private static LinuxKernelVersion minLkvSupportSendfile =
      new LinuxKernelVersion((short)2, (short)6, (short)33);
{code}

I recommend marking this {{final}}, because it would never be correct to overwrite {{minLkvSupportSendfile}}
at runtime.  Similar to above, {{MIN_LKV_SUPPORT_SENDFILE}} would better fit the naming convention.

{code}
  public static class LinuxKernelVersion implements Comparable<LinuxKernelVersion>{
{code}

It appears that this is {{public}} only so that {{TestShell}} can access it.  Otherwise, it's
an internal implementation detail of {{Shell}}.  If that's true, then would you please make
it package-private instead of public and apply the {{VisibleForTesting}} annotation?

{code}
      String[] args = {"uname", "bash", "-r"};
{code}

The "bash" needs to be removed, so that this is \{"uname", "-r"\} instead.  Currently, this
is causing the following to show up in logs for the test run, and incorrectly identifying
that {{sendfile}} is not supported on my system.

{code}
2014-10-26 21:27:52,094 WARN  util.Shell (Shell.java:isLinuxSendfileSupported(481)) - isLinuxSendfileSupported()
failed unexpected: ExitCodeException exitCode=1: uname: extra operand `bash'
{code}

Also, it would be valuable to add a test that actually checks that {{isLinuxSendfileAvailable}}
is {{true}} if the test is running on a system with kernel version >= 2.6.33.

{code}
      LOG.warn("isLinuxSendfileSupported() failed unexpected: " + e);
{code}

Let's change this log message to debug level.  In its current form, there would always be
a warning logged one time for every process (including user-facing interactive Hadoop shell
commands) when running on systems that don't have {{sendfile}}, because it's triggered from
static initialization of the class.  This could be noisy, and since our intention is to fallback
gracefully, I think debug level is acceptable.


> Persist in-memory replicas using unbuffered IO should only applies to supported Linux
version
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7291
>                 URL: https://issues.apache.org/jira/browse/HDFS-7291
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch
>
>
> HDFS-7090 changes to persist in-memory replicas using unbuffered IO on Linux and Windows.
On Linux distribution, it relies on the sendfile() API between two file descriptors to achieve
unbuffered IO copy. According to Linux document at http://man7.org/linux/man-pages/man2/sendfile.2.html,
this is only supported on Linux kernel 2.6.33+.  This JIRA is to limit the usage of sendfile()
for lazy persist only on Linux distribution with kernel version higher than 2.6.33. For unsupported
version, lazy persist will fallback to normal buffered IO.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message