hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14111) hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
Date Mon, 25 Feb 2019 23:11:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777375#comment-16777375 ]

Todd Lipcon commented on HDFS-14111:

        jthr = newJavaStr(env, "in:readbytebuffer", &jCapabilityString);
        jthr = invokeMethod(env, &jVal, INSTANCE, jFile, HADOOP_ISTRM,
                   "hasCapability", "(Ljava/lang/String;)Z", jCapabilityString);

We probably need to check 'jthr' from the newJavaStr call before moving on to invokeMethod,
even though it's highly unlikely to fail (it could hit an OOM while allocating the string).
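
The check suggested above might look roughly like the following. This is a minimal,
self-contained sketch in plain C, not the actual libhdfs patch: the fake_* types and
functions are hypothetical stand-ins for jthrowable, jstring, newJavaStr and invokeMethod,
introduced only so the control flow can be compiled and run without a JVM.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for JNI types; real code uses jthrowable/jstring. */
typedef const char *fake_throwable;   /* NULL means "no exception" */
typedef const char *fake_jstring;

/* Counts how often the second call is reached. */
static int invoke_calls = 0;

/* Hypothetical stub for newJavaStr: fails when simulate_oom is non-zero. */
static fake_throwable fake_newJavaStr(const char *s, fake_jstring *out,
                                      int simulate_oom)
{
    if (simulate_oom) {
        *out = NULL;
        return "java.lang.OutOfMemoryError";
    }
    *out = s;
    return NULL;
}

/* Hypothetical stub for the invokeMethod(..., "hasCapability", ...) call. */
static fake_throwable fake_hasCapability(fake_jstring cap, int *result)
{
    invoke_calls++;
    *result = (cap != NULL);
    return NULL;
}

/* Returns 1 if the capability is reported, 0 if not, -1 on error. */
static int probe_capability(int simulate_oom)
{
    fake_jstring jCapabilityString = NULL;
    fake_throwable jthr;
    int has = 0;

    jthr = fake_newJavaStr("in:readbytebuffer", &jCapabilityString,
                           simulate_oom);
    if (jthr) {
        /* Bail out before using a string that was never created. */
        fprintf(stderr, "newJavaStr failed: %s\n", jthr);
        return -1;
    }

    jthr = fake_hasCapability(jCapabilityString, &has);
    if (jthr) {
        fprintf(stderr, "hasCapability failed: %s\n", jthr);
        return -1;
    }
    return has;
}
```

Calling probe_capability(1) simulates the allocation failure: it returns -1 and never
reaches the second call, which is the behavior the extra 'jthr' check is meant to guarantee.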

Otherwise I think this makes sense. [~stevel@apache.org] does the StreamCapabilities change
look good to you?

> hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
> -------------------------------------------------------------
>                 Key: HDFS-14111
>                 URL: https://issues.apache.org/jira/browse/HDFS-14111
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client, libhdfs
>    Affects Versions: 3.2.0
>            Reporter: Todd Lipcon
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HDFS-14111.001.patch, HDFS-14111.002.patch
> hdfsOpenFile() calls readDirect() with a 0-length argument in order to check whether
the underlying stream supports bytebuffer reads. With DFSInputStream, the read(0) isn't
short-circuited, and results in the DFSClient opening a block reader. In the case of a remote block,
the block reader will actually issue a read of the whole block, causing the datanode to perform
unnecessary IO and network transfers in order to fill up the client's TCP buffers. This causes
performance degradation.
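
To make the cost described above concrete, here is a small, self-contained C sketch that
contrasts the two ways of probing for bytebuffer support. It is an illustration under
stated assumptions, not the real client code: struct mock_stream and every function in it
are hypothetical, and the "block reader" is just a counter standing in for the expensive
side effect.

```c
#include <stddef.h>

/* Hypothetical mock of an input stream; not the real DFSInputStream. */
struct mock_stream {
    int supports_bytebuffer;   /* what a capability query would report */
    int block_readers_opened;  /* counts the expensive side effect */
};

/*
 * Mock read. As in the description, a zero-length read is NOT
 * short-circuited here, so it still "opens a block reader".
 */
static int mock_read_direct(struct mock_stream *s, size_t len)
{
    s->block_readers_opened++;
    if (!s->supports_bytebuffer)
        return -1;
    return (int)len;  /* pretend len bytes were read */
}

/* Probe by issuing a zero-length read: correct answer, unwanted IO. */
static int probe_by_read(struct mock_stream *s)
{
    return mock_read_direct(s, 0) >= 0;
}

/* Probe by asking the stream what it supports: no read is issued. */
static int probe_by_capability(const struct mock_stream *s)
{
    return s->supports_bytebuffer;
}
```

Both probes give the same answer for a given stream, but only probe_by_read increments
block_readers_opened, which mirrors why a capability query is the cheaper check.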
