hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-538) Implement a nio's 'direct buffer' based wrapper over zlib to improve performance of java.util.zip.{De|In}flater as a 'custom codec'
Date Thu, 02 Nov 2006 16:42:18 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-538?page=comments#action_12446647 ] 
            
Arun C Murthy commented on HADOOP-538:
--------------------------------------

To further annotate my previous comment: I believe the perception today is that src/c++ contains
code to 'access' hadoop via C/C++; I think that is something worth carrying on, which is why
I'd put the native code which is 'core' hadoop in src/native.

A related problem came up during a discussion with Owen: we cannot let a single 'core-hadoop'
library, i.e. libhadoop.so, have a direct dependency on libz.so (the zlib shared object). In
future we would then need libhadoop.so to depend on liblzo.so (the lzo library) as well, and
so on. That would force people to install both libz.so and liblzo.so even if they want to use
only one of them.

Solutions:
a) Have multiple libraries: libhadoopzlib.so, libhadooplzo.so and so on. This means we will have
multiple .so's to track and load in hadoop, which could turn out to be a maintenance nightmare.
b) Supply stubs for each 'piece', i.e. zlib/lzo, which have 'weak' symbols and let them get
overridden on loading the actual .so.
c) Ensure the native code doesn't actually do a -lz (or -llzo) in the Makefile, but relies on
dlopen/dlsym for the necessary libz/liblzo calls. This means libhadoop.so doesn't have a direct
dependency on either libz.so or liblzo.so, is still a single shared object with all of hadoop's
native code, and also lets people load whatever pieces of native code they like without paying
for the rest.

Given its relative simplicity I'd go for option c) - it's easy and effective.
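
To make option c) concrete, here is a rough sketch of what the lookup could look like. It is
not taken from the patch: the helper names (hadoop_load_symbol, hadoop_zlib_version) are made
up for illustration, and "libz.so.1" is an assumed soname that will differ between platforms.
Only dlopen/dlsym/dlerror/dlclose and zlib's zlibVersion() are real calls.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Signature of zlib's zlibVersion(), declared here so zlib.h is not needed. */
typedef const char *(*zlib_version_fn)(void);

/*
 * Hypothetical helper: open `library` at runtime and resolve `symbol` from it.
 * Returns the symbol address, or NULL if the library or symbol is missing.
 * On success the library handle is stored in *handle_out.
 */
void *hadoop_load_symbol(const char *library, const char *symbol,
                         void **handle_out)
{
    *handle_out = NULL;

    void *handle = dlopen(library, RTLD_LAZY | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", library, dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        fprintf(stderr, "cannot resolve %s in %s\n", symbol, library);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return sym;
}

/*
 * Returns zlib's version string, or NULL when zlib is not installed.
 * "libz.so.1" is an assumed soname. The handle is deliberately left
 * open so the returned string stays valid.
 */
const char *hadoop_zlib_version(void)
{
    void *handle = NULL;
    void *sym = hadoop_load_symbol("libz.so.1", "zlibVersion", &handle);
    if (sym == NULL) {
        return NULL;
    }

    zlib_version_fn fn;
    *(void **)(&fn) = sym; /* POSIX idiom for object-to-function pointer */
    return fn();
}
```

A caller would check hadoop_zlib_version() for NULL and fall back to the pure-Java codec when
zlib is absent; the same pattern would apply to liblzo. Note there is no -lz on the link line,
though older glibc versions still need -ldl.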

Thoughts?

> Implement a nio's 'direct buffer' based wrapper over zlib to improve performance of java.util.zip.{De|In}flater as a 'custom codec'
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-538
>                 URL: http://issues.apache.org/jira/browse/HADOOP-538
>             Project: Hadoop
>          Issue Type: Improvement
>    Affects Versions: 0.6.1
>            Reporter: Arun C Murthy
>         Assigned To: Arun C Murthy
>             Fix For: 0.9.0
>
>         Attachments: HADOOP-538.patch, HADOOP-538_20061005.tgz, HADOOP-538_20061011.tgz,
>                      HADOOP-538_20061026.tgz, HADOOP-538_20061030.tgz, HADOOP-538_benchmarks.tgz
>
>
> There has been more than one instance where java.util.zip's {De|In}flater classes perform
> unreliably; a simple wrapper over zlib-1.2.3 (latest stable) using java.nio.ByteBuffer (i.e.
> direct buffers) should go a long way in alleviating these woes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
