hadoop-common-issues mailing list archives

From "Xiao Kang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6662) hadoop zlib compression does not fully utilize the buffer
Date Tue, 30 Mar 2010 06:44:27 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Kang updated HADOOP-6662:

    Attachment: ZlibCompressor.patch

Patch attached.

needsInput() now checks uncompressedDirectBuf: if it is full, it returns false; otherwise it
copies data from the saved userBuf and then rechecks.
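
A minimal sketch of that rule, for illustration only. This is not the code in
ZlibCompressor.patch: the class NeedsInputSketch, the heap ByteBuffer standing in for the
direct buffer, and the helper buffered() are assumptions made for this example.

```java
import java.nio.ByteBuffer;

/** Simplified model of the buffering rule; not the actual ZlibCompressor code. */
class NeedsInputSketch {
    // Stand-in for the direct buffer that is handed to zlib.
    private final ByteBuffer uncompressedDirectBuf;
    // Caller data saved by setInput() that has not yet been copied into the buffer.
    private byte[] userBuf = new byte[0];
    private int userBufOff = 0;
    private int userBufLen = 0;

    NeedsInputSketch(int bufferSize) {
        uncompressedDirectBuf = ByteBuffer.allocate(bufferSize);
    }

    void setInput(byte[] b, int off, int len) {
        userBuf = b;
        userBufOff = off;
        userBufLen = len;
    }

    boolean needsInput() {
        // Buffer already full: it has to be compressed before more input is accepted.
        if (!uncompressedDirectBuf.hasRemaining()) {
            return false;
        }
        // Room left: top up from the saved user data, then recheck.
        if (userBufLen > 0) {
            int n = Math.min(userBufLen, uncompressedDirectBuf.remaining());
            uncompressedDirectBuf.put(userBuf, userBufOff, n);
            userBufOff += n;
            userBufLen -= n;
        }
        return uncompressedDirectBuf.hasRemaining();
    }

    /** Hypothetical helper: number of bytes currently held in the buffer. */
    int buffered() {
        return uncompressedDirectBuf.position();
    }

    public static void main(String[] args) {
        NeedsInputSketch c = new NeedsInputSketch(8);
        c.setInput(new byte[5], 0, 5);
        System.out.println(c.needsInput()); // 5 of 8 bytes used, so more input is welcome
        c.setInput(new byte[5], 0, 5);
        System.out.println(c.needsInput()); // buffer fills up, 2 bytes stay in userBuf
    }
}
```

With an 8-byte buffer, the first call leaves room and the second fills the buffer, so the
caller is told to compress before supplying more data.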

One special case must be respected: zlib may not consume all of the input in
uncompressedDirectBuf because the output buffer is too small. This may be why the original
code simply returns false if uncompressedBufLen > 0.

After the JNI compress call, uncompressedBufLen is set to the length of the remaining input
data that zlib has not consumed. So if uncompressedBufLen > 0 after deflateBytesDirect() is
invoked, the flag keepUncompressedBuf is set to true to indicate that no input is needed and
that compress() should be invoked again to compress the remaining input data.
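
The flag handling can be sketched roughly as follows. This is an illustrative model, not the
patch itself: deflateStandIn() is a hypothetical replacement for the native
deflateBytesDirect() that just copies bytes and stops when the output array is full, and the
class name and array-based buffer are likewise assumptions.

```java
/** Simplified model of the keepUncompressedBuf flag; not the actual ZlibCompressor code. */
class KeepUncompressedBufSketch {
    private final byte[] uncompressedBuf;
    private int uncompressedBufOff = 0;   // start of the input not yet consumed
    private int uncompressedBufLen = 0;   // number of input bytes not yet consumed
    private boolean keepUncompressedBuf = false;

    KeepUncompressedBufSketch(int size) {
        uncompressedBuf = new byte[size];
    }

    void setInput(byte[] b) {
        if (!needsInput() || uncompressedBufLen + b.length > uncompressedBuf.length) {
            throw new IllegalStateException("buffer cannot accept this input");
        }
        System.arraycopy(b, 0, uncompressedBuf, uncompressedBufLen, b.length);
        uncompressedBufLen += b.length;
    }

    boolean needsInput() {
        // Leftover input from the last compress(): compress again before refilling.
        if (keepUncompressedBuf && uncompressedBufLen > 0) {
            return false;
        }
        return uncompressedBufLen < uncompressedBuf.length;
    }

    int compress(byte[] out) {
        int n = deflateStandIn(out);
        // Remember whether the "native" call left input behind.
        keepUncompressedBuf = uncompressedBufLen > 0;
        if (!keepUncompressedBuf) {
            uncompressedBufOff = 0; // everything consumed, buffer can be reused
        }
        return n;
    }

    /** Hypothetical stand-in for deflateBytesDirect(): copies until out is full. */
    private int deflateStandIn(byte[] out) {
        int n = Math.min(uncompressedBufLen, out.length);
        System.arraycopy(uncompressedBuf, uncompressedBufOff, out, 0, n);
        uncompressedBufOff += n;
        uncompressedBufLen -= n;
        return n;
    }

    public static void main(String[] args) {
        KeepUncompressedBufSketch c = new KeepUncompressedBufSketch(8);
        c.setInput(new byte[] {1, 2, 3, 4, 5, 6});
        byte[] out = new byte[4];                 // too small for all 6 input bytes
        System.out.println(c.compress(out));      // only part of the input is consumed
        System.out.println(c.needsInput());       // flag is set, so no input is needed
        System.out.println(c.compress(out));      // second call drains the remainder
        System.out.println(c.needsInput());       // buffer is free again
    }
}
```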

> hadoop zlib compression does not fully utilize the buffer
> ---------------------------------------------------------
>                 Key: HADOOP-6662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6662
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.20.2
>            Reporter: Xiao Kang
>         Attachments: ZlibCompressor.patch
> org.apache.hadoop.io.compress.zlib.ZlibCompressor does not fully utilize its buffer.
> Its needsInput() returns false whenever there is any data in its buffer (64K by default).
> Performance degrades greatly, since a JNI call is invoked each time the write()
> method of CompressorStream is called.
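
The cost described in the quoted issue can be illustrated with a small counting model. It
is only a sketch under simplifying assumptions: it counts one stand-in native call per
buffer handed to zlib, ignores output draining, and the class and method names are
hypothetical. The 64K buffer size is the default mentioned in the issue.

```java
/** Counts stand-in native calls under the old and the buffered needsInput() rule. */
class JniCallCountSketch {
    static final int BUFFER_SIZE = 64 * 1024;

    /**
     * Simulates `writes` calls to write(), each with `writeSize` bytes.
     * fillBufferFirst == false: needsInput() is false as soon as any data is buffered.
     * fillBufferFirst == true:  needsInput() is false only when the buffer is full.
     */
    static int countNativeCalls(int writes, int writeSize, boolean fillBufferFirst) {
        int buffered = 0; // bytes currently in the buffer
        int calls = 0;    // stand-in for JNI compress calls
        for (int i = 0; i < writes; i++) {
            int pending = writeSize; // user data not yet copied into the buffer
            while (pending > 0) {
                int n = Math.min(pending, BUFFER_SIZE - buffered);
                buffered += n;
                pending -= n;
                boolean needsInput = fillBufferFirst
                        ? buffered < BUFFER_SIZE
                        : buffered == 0;
                if (!needsInput) {
                    calls++;       // hand the buffer to the compressor
                    buffered = 0;
                }
            }
        }
        if (buffered > 0) {
            calls++;               // final flush of a partly filled buffer
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(countNativeCalls(1000, 100, false)); // one call per write
        System.out.println(countNativeCalls(1000, 100, true));  // one call per filled buffer, plus flush
    }
}
```

For 1000 writes of 100 bytes each, the model makes 1000 calls under the old rule and 2
under the buffered rule (one full 64K buffer plus the final flush).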

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
