hadoop-common-dev mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Compressor tweaks corresponding to HDFS-2834, 3051?
Date Wed, 07 Mar 2012 15:16:17 GMT
I am a +1 on opening a new JIRA for a first stab at reducing the amount of data that gets copied.

--Bobby Evans

On 3/7/12 1:26 AM, "Tim Broberg" <Tim.Broberg@exar.com> wrote:

In https://issues.apache.org/jira/browse/HDFS-2834, Todd says, "

  This is also useful whenever a native decompression codec is being used. In those cases,
we generally have the following copies:

  1) Socket -> DirectByteBuffer (in SocketChannel implementation)
  2) DirectByteBuffer -> byte[] (in SocketInputStream)
  3) byte[] -> Native buffer (set up for decompression)
  4*) decompression to a different native buffer (not really a copy - decompression necessarily
  5) native buffer -> byte[]

  With the proposed improvement we can hopefully eliminate #2 and #3 for all applications, and
#2, #3, and #5 for libhdfs."

It seems like we need to tweak the Decompressor (and Compressor?) classes to take DirectByteBuffer
inputs and outputs rather than byte[] arrays to support this improvement.
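
As a rough illustration of what a buffer-based interface might look like, here is a minimal
sketch. The names BufferDecompressor, ZlibBufferDecompressor and DirectDecompressDemo are
hypothetical and are not existing Hadoop classes. The sketch uses the JDK's
java.util.zip.Inflater, whose ByteBuffer overloads require Java 11 or later, as a stand-in
for a native codec, and it assumes the whole compressed stream is already in src.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical interface: the name and signature are illustrative only.
interface BufferDecompressor {
    // Decompresses the remaining bytes of src into dst and returns the number
    // of bytes written. Both buffers may be direct, so no byte[] is needed.
    int decompress(ByteBuffer src, ByteBuffer dst);
}

// zlib-backed sketch standing in for a native codec.
final class ZlibBufferDecompressor implements BufferDecompressor {
    @Override
    public int decompress(ByteBuffer src, ByteBuffer dst) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(src);
            int total = 0;
            while (!inflater.finished() && dst.hasRemaining()) {
                int n = inflater.inflate(dst);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input, or a preset dictionary is required
                }
                total += n;
            }
            return total;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed input", e);
        } finally {
            inflater.end(); // release the native zlib state
        }
    }
}

final class DirectDecompressDemo {
    // Test helper: compresses a small array and returns it in a direct buffer.
    // The fixed slack of 64 bytes is only adequate for small demo inputs.
    static ByteBuffer deflateToDirect(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        byte[] tmp = new byte[raw.length + 64];
        int len = deflater.deflate(tmp);
        deflater.end();
        ByteBuffer direct = ByteBuffer.allocateDirect(len);
        direct.put(tmp, 0, len);
        direct.flip();
        return direct;
    }

    public static void main(String[] args) {
        byte[] raw = "copy copy copy copy copy".getBytes(StandardCharsets.UTF_8);
        ByteBuffer src = deflateToDirect(raw);
        ByteBuffer dst = ByteBuffer.allocateDirect(raw.length);
        int n = new ZlibBufferDecompressor().decompress(src, dst);
        dst.flip();
        byte[] out = new byte[n];
        dst.get(out); // copied out here only so the result can be printed
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

With this shape, a caller that already holds the data in a direct buffer can hand it to the
codec without the intermediate byte[] copies of steps 2 and 3, and a native consumer could
read dst without step 5.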

Is the right thing to do for me to open a jira in common for this and take a first stab at
defining the interface?

    - Tim.

