hadoop-common-issues mailing list archives

From "John Zhuge (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-14765) AdlFsInputStream should implement unbuffer
Date Tue, 19 Sep 2017 03:03:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171041#comment-16171041 ]

John Zhuge edited comment on HADOOP-14765 at 9/19/17 3:02 AM:
--------------------------------------------------------------

unbuffer is not flush. It does not attempt to write any unwritten data. It only reduces the
memory held by the buffer. Besides the HBase case, another use case is Impala's file handle
cache, which calls unbuffer before it caches a file handle. I believe the JIRA is IMPALA-1588
"Cache HDFS file handle to avoid repeated hdfs fopen call".
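
To illustrate the caching use case, here is a minimal sketch of a handle cache that calls
unbuffer before parking a handle. Unbufferable and HandleCacheSketch are hypothetical names:
the interface is a local stand-in for Hadoop's CanUnbuffer so the example compiles without
Hadoop on the classpath, and the cache is not Impala's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for Hadoop's CanUnbuffer (assumption: a single
// no-argument unbuffer() method, as the issue title suggests).
interface Unbufferable {
    void unbuffer();
}

// Hypothetical cache of open handles keyed by path.
class HandleCacheSketch<H extends Unbufferable> {
    private final Map<String, H> idle = new HashMap<>();

    // Release read buffers before parking the handle. The handle itself
    // stays open, so a later acquire() avoids a fresh open call.
    void release(String path, H handle) {
        handle.unbuffer();
        idle.put(path, handle);
    }

    // Returns a parked handle, or null if the caller must open a new one.
    H acquire(String path) {
        return idle.remove(path);
    }
}
```

The point of the design is that an idle handle in the cache should cost as little memory as
possible while remaining usable without reopening the file.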

We just need to set ADLFileInputStream#buffer to null, which saves 4 MB by default (or whatever
the read buffer size is set to). There is no need to close the socket.

Unfortunately, the current ADLFileInputStream#unbuffer has slightly different semantics.
It only forces the next read to fetch from the server. It does not free the buffer.
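
The difference between the two behaviors can be sketched as follows. SketchBufferedStream is
a hypothetical class, not the real ADLFileInputStream; a byte array stands in for the file on
the server, and the field names are assumptions made for illustration.

```java
// Hypothetical buffered reader over a "remote" file held in a byte array.
class SketchBufferedStream {
    private final byte[] remoteFile;  // stands in for data on the server
    private final int bufferSize;     // the comment above cites 4 MB as the default
    private byte[] buffer;            // lazily allocated read buffer
    private int filePos;              // file offset of the next byte to return
    private int bufStart;             // file offset of buffer[0]
    private int bufLen;               // number of valid bytes in buffer
    int serverFetches;                // counts simulated round trips

    SketchBufferedStream(byte[] remoteFile, int bufferSize) {
        this.remoteFile = remoteFile;
        this.bufferSize = bufferSize;
    }

    int read() {
        if (filePos >= remoteFile.length) {
            return -1;
        }
        if (buffer == null || filePos >= bufStart + bufLen) {
            if (buffer == null) {
                buffer = new byte[bufferSize];  // re-allocate after unbuffer()
            }
            bufLen = Math.min(bufferSize, remoteFile.length - filePos);
            System.arraycopy(remoteFile, filePos, buffer, 0, bufLen);
            bufStart = filePos;
            serverFetches++;
        }
        int b = buffer[filePos - bufStart] & 0xff;
        filePos++;
        return b;
    }

    // Behavior described as the current one: discard buffered data so the
    // next read goes to the server, but keep the array allocated.
    void invalidateOnly() {
        bufLen = 0;
    }

    // Behavior proposed in this comment: also drop the array so it can be
    // garbage collected while the stream is idle. Nothing is closed.
    void unbuffer() {
        bufLen = 0;
        buffer = null;
    }

    boolean holdsBuffer() {
        return buffer != null;
    }
}
```

In both cases the next read fetches from the server again; only unbuffer() gives the memory
back in the meantime.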



> AdlFsInputStream should implement unbuffer
> ------------------------------------------
>
>                 Key: HADOOP-14765
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14765
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/adl
>    Affects Versions: 2.8.0
>            Reporter: John Zhuge
>            Priority: Minor
>
> HBase and Impala rely on FileSystems implementing CanUnbuffer.unbuffer() to force input
> streams to free up remote connections (HBASE-9393). This works for HDFS, but not elsewhere.




