hadoop-common-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8148) Zero-copy ByteBuffer-based compressor / decompressor API
Date Fri, 20 Apr 2012 02:38:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257987#comment-13257987 ]

Todd Lipcon commented on HADOOP-8148:
-------------------------------------

Duplicating my comment from HADOOP-8258:

{quote}
In current versions of Hadoop, the read path for applications like HBase often looks like:

# allocate a byte array for an HFile block (~64kb)
# call read() into that byte array:
#* copy 1: read() packets from the socket into a direct buffer provided by the DirectBufferPool
#* copy 2: copy from the direct buffer pool into the provided byte[]
# call setInput() on a decompressor:
#* copy 3: copy from the byte[] back to a direct buffer inside the codec implementation
# call decompress():
#* JNI code accesses the input buffer and writes to the output buffer
#* copy 4: from the output buffer back into the byte[] for the uncompressed hfile block
# inefficiency: HBase now does its own checksumming. Since it has to checksum the byte[], it can't easily use the SSE-enabled checksum path.

Given the new direct-buffer read support introduced by HDFS-2834, we can remove copies #2 and #3 (see the code sketch after this quote):

# allocate a DirectBuffer for the compressed hfile block, and one for the uncompressed block (we know the size from the hfile block header)
# call read() into the direct buffer using the HDFS-2834 API:
#* copy 1: read() packets from the socket into that buffer
# call setInput() with that buffer. no copies necessary
# call decompress():
#* JNI code accesses the input buffer and writes directly to the output buffer, with no copies
# HBase now has the uncompressed block as a direct buffer. It can use the SSE-enabled checksum for better efficiency

This should improve the performance of HBase significantly. We may also be able to use the new API from within SequenceFile and other compressible file formats to avoid two copies from the read path.

The same applies to the write path, but in my experience the write path is less often CPU-constrained, so I'd prefer to concentrate on the read path first.
{quote}
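
To make the copies above concrete, here is a rough sketch of both reads in terms of today's public APIs (FSDataInputStream, the byte[]-based Decompressor, and the read(ByteBuffer) call added by HDFS-2834). The SnappyCodec choice, the buffer sizes, and the class/method names are illustrative only, and all of the HFile/HBase bookkeeping is omitted.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;
import org.apache.hadoop.io.compress.SnappyCodec;

public class ReadPathSketch {

  /** Today's path: every buffer is a byte[], so copies 2, 3 and 4 are unavoidable. */
  static byte[] readAndDecompressWithByteArrays(FileSystem fs, Path path,
      int compressedLen, int uncompressedLen, CompressionCodec codec) throws IOException {
    byte[] compressed = new byte[compressedLen];        // heap buffer -> forces copy #2
    byte[] uncompressed = new byte[uncompressedLen];    // heap buffer -> forces copy #4
    try (FSDataInputStream in = fs.open(path)) {
      in.readFully(0, compressed);                      // copies #1 and #2 happen inside read()
    }
    Decompressor d = codec.createDecompressor();
    d.setInput(compressed, 0, compressedLen);           // copy #3: byte[] -> codec's native buffer
    d.decompress(uncompressed, 0, uncompressedLen);     // copy #4: native buffer -> byte[]
    return uncompressed;
  }

  /** HDFS-2834 path: read() can fill a direct ByteBuffer, eliminating copy #2. */
  static ByteBuffer readIntoDirectBuffer(FileSystem fs, Path path, int compressedLen)
      throws IOException {
    ByteBuffer compressed = ByteBuffer.allocateDirect(compressedLen);
    try (FSDataInputStream in = fs.open(path)) {
      while (compressed.hasRemaining() && in.read(compressed) > 0) {
        // read(ByteBuffer) is the HDFS-2834 API; packets land in the direct buffer (copy #1 only)
      }
    }
    compressed.flip();
    return compressed;
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SnappyCodec codec = new SnappyCodec();
    codec.setConf(conf);
    // Sizes are placeholders; a real HFile block header supplies the actual lengths.
    readAndDecompressWithByteArrays(fs, new Path(args[0]), 64 * 1024, 256 * 1024, codec);
    readIntoDirectBuffer(fs, new Path(args[0]), 64 * 1024);
  }
}
{code}

The second method still has nothing the current codec API can consume without copying back through a byte[]; a ByteBuffer-aware setInput()/decompress(), which is what this JIRA proposes, is what would remove copies #3 and #4 as well.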
                
> Zero-copy ByteBuffer-based compressor / decompressor API
> --------------------------------------------------------
>
>                 Key: HADOOP-8148
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8148
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Tim Broberg
>            Assignee: Tim Broberg
>         Attachments: hadoop8148.patch
>
>
> Per Todd Lipcon's comment in HDFS-2834, "
>   Whenever a native decompression codec is being used, ... we generally have the following copies:
>   1) Socket -> DirectByteBuffer (in SocketChannel implementation)
>   2) DirectByteBuffer -> byte[] (in SocketInputStream)
>   3) byte[] -> Native buffer (set up for decompression)
>   4*) decompression to a different native buffer (not really a copy - decompression necessarily rewrites)
>   5) native buffer -> byte[]
>   with the proposed improvement we can hopefully eliminate #2, #3 for all applications, and #2, #3, and #5 for libhdfs.
> "
> The interfaces in the attached patch attempt to address:
>  A - Compression and decompression based on ByteBuffers (HDFS-2834)
>  B - Zero-copy compression and decompression (HDFS-3051)
>  C - Provide the caller a way to know the maximum space required to hold the compressed output.
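
As a rough illustration of requirements A-C, an interface along these lines might be expected. This is a hypothetical sketch, not the interface from the attached patch; the names ByteBufferCompressor, compress() and maxCompressedLength() are invented for illustration.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch only -- not the interface from the attached patch.
 * The codec consumes and produces ByteBuffers (ideally direct ones) so the
 * native implementation can read and write them in place with no byte[]
 * copies (A, B), and the caller can size the output buffer up front (C).
 */
public interface ByteBufferCompressor {

  /** Upper bound on the compressed size of uncompressedLength input bytes (requirement C). */
  int maxCompressedLength(int uncompressedLength);

  /**
   * Compresses the bytes between input.position() and input.limit() into
   * output, advancing both buffers' positions (requirements A and B).
   *
   * @return the number of bytes written to output
   */
  int compress(ByteBuffer input, ByteBuffer output) throws IOException;
}
{code}

A caller could then allocate ByteBuffer.allocateDirect(codec.maxCompressedLength(n)) once per block and reuse it, instead of guessing an output size or growing a buffer on overflow.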

