hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-21879) Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose
Date Sun, 16 Jun 2019 13:58:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-21879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865013#comment-16865013
] 

Hudson commented on HBASE-21879:
--------------------------------

Results for branch HBASE-21879
	[build #145 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21879/145/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21879/145//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21879/145//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21879/145//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21879
>                 URL: https://issues.apache.org/jira/browse/HBASE-21879
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HBASE-21879.v1.patch, HBASE-21879.v1.patch, QPS-latencies-before-HBASE-21879.png,
gc-data-before-HBASE-21879.png
>
>
> In HFileBlock#readBlockDataInternal,  we have the following: 
> {code}
> @VisibleForTesting
> protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
>     long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics)
>  throws IOException {
>  // .....
>   // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool
(offheap).
>   byte [] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
>   int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
>       onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread);
>   if (headerBuf != null) {
>         // ...
>   }
>   // ...
>  }
> {code}
> In the read path,  we still read the block from hfile to on-heap byte[], then copy the
on-heap byte[] to offheap bucket cache asynchronously,  and in my  100% get performance test,
I also observed some frequent young gc,  The largest memory footprint in the young gen should
be the on-heap block byte[].
> In fact, we can read HFile's block to ByteBuffer directly instead of to byte[] for reducing
young gc purpose. we did not implement this before, because no ByteBuffer reading interface
in the older HDFS client, but 2.7+ has supported this now,  so we can fix this now. I think.

> Will provide an patch and some perf-comparison for this. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message