hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chia-Ping Tsai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17623) Reuse the bytes array when building the hfile block
Date Tue, 21 Mar 2017 19:21:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935151#comment-15935151
] 

Chia-Ping Tsai commented on HBASE-17623:
----------------------------------------

bq. In HFileBlock I can see above javadoc. But we dont use the API getBlockForCaching() for
caching a read block right.
The javadoc "is used when the block data has already been read" is written by HBASE-15366.
And the "uncompressed" is broken by HBASE-11331. i'd think that the javadoc is out-of-date.
i will correct it.

bq. Did some search in the code and seems we dont
Done. The block generated by getBlockForCaching() is cached only when writing blocks.

bq. Around this method add a note that this should be used only while writing blocks and caching
and mention abt the copy we do
patch is coming soon.

Thanks for the comment. [~anoop.hbase]]


> Reuse the bytes array when building the hfile block
> ---------------------------------------------------
>
>                 Key: HBASE-17623
>                 URL: https://issues.apache.org/jira/browse/HBASE-17623
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: after(snappy_hfilesize=5.04GB).png, after(snappy_hfilesize=755MB).png,
before(snappy_hfilesize=5.04GB).png, before(snappy_hfilesize=755MB).png, GC measurement.xlsx,
HBASE-17623.branch-1.v0.patch, HBASE-17623.branch-1.v1.patch, HBASE-17623.branch-1.v2.patch,
HBASE-17623.branch-1.v2.patch, HBASE-17623.v0.patch, HBASE-17623.v1.patch, HBASE-17623.v1.patch,
HBASE-17623.v2.patch, memory allocation measurement.xlsx
>
>
> There are two improvements.
> # The onDiskBlockBytesWithHeader should maintain a bytes array which can be reused when
building the hfile.
> # The onDiskBlockBytesWithHeader is copied to an new bytes array only when we need to
cache the block.
> # If no block need to be cached, the uncompressedBlockBytesWithHeader will never be created.
> {code:title=HFileBlock.java|borderStyle=solid}
>     private void finishBlock() throws IOException {
>       if (blockType == BlockType.DATA) {
>         this.dataBlockEncoder.endBlockEncoding(dataBlockEncodingCtx, userDataStream,
>             baosInMemory.getBuffer(), blockType);
>         blockType = dataBlockEncodingCtx.getBlockType();
>       }
>       userDataStream.flush();
>       // This does an array copy, so it is safe to cache this byte array when cache-on-write.
>       // Header is still the empty, 'dummy' header that is yet to be filled out.
>       uncompressedBlockBytesWithHeader = baosInMemory.toByteArray();
>       prevOffset = prevOffsetByType[blockType.getId()];
>       // We need to set state before we can package the block up for cache-on-write.
In a way, the
>       // block is ready, but not yet encoded or compressed.
>       state = State.BLOCK_READY;
>       if (blockType == BlockType.DATA || blockType == BlockType.ENCODED_DATA) {
>         onDiskBlockBytesWithHeader = dataBlockEncodingCtx.
>             compressAndEncrypt(uncompressedBlockBytesWithHeader);
>       } else {
>         onDiskBlockBytesWithHeader = defaultBlockEncodingCtx.
>             compressAndEncrypt(uncompressedBlockBytesWithHeader);
>       }
>       // Calculate how many bytes we need for checksum on the tail of the block.
>       int numBytes = (int) ChecksumUtil.numBytes(
>           onDiskBlockBytesWithHeader.length,
>           fileContext.getBytesPerChecksum());
>       // Put the header for the on disk bytes; header currently is unfilled-out
>       putHeader(onDiskBlockBytesWithHeader, 0,
>           onDiskBlockBytesWithHeader.length + numBytes,
>           uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
>       // Set the header for the uncompressed bytes (for cache-on-write) -- IFF different
from
>       // onDiskBlockBytesWithHeader array.
>       if (onDiskBlockBytesWithHeader != uncompressedBlockBytesWithHeader) {
>         putHeader(uncompressedBlockBytesWithHeader, 0,
>           onDiskBlockBytesWithHeader.length + numBytes,
>           uncompressedBlockBytesWithHeader.length, onDiskBlockBytesWithHeader.length);
>       }
>       if (onDiskChecksum.length != numBytes) {
>         onDiskChecksum = new byte[numBytes];
>       }
>       ChecksumUtil.generateChecksums(
>           onDiskBlockBytesWithHeader, 0, onDiskBlockBytesWithHeader.length,
>           onDiskChecksum, 0, fileContext.getChecksumType(), fileContext.getBytesPerChecksum());
>     }{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message