orc-dev mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: How to make ORC use libz.so instead of libzip.so
Date Thu, 07 Feb 2019 17:23:04 GMT
    
>    We are conducting a project involving replacing (Linux) system's
>    libz.so with our own hardware based implementation, but this requires us to 
>    replace libzip.so with our own so that small zip processing doesn't go through 
>    hardware, as hardware actually cannot process these requests correctly due to 
>    structural differences between hardware and software implementations of the 
>    deflate algorithm. 

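You're hitting a limitation of JDK 8 and earlier: builds of those JDKs typically bundle their own
zlib inside libzip.so instead of linking against the system libz.so.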

https://bugs.openjdk.java.net/browse/JDK-8079759 -> https://bugs.openjdk.java.net/browse/JDK-8031767
-> https://bugs.openjdk.java.net/browse/JDK-8176343

I've got a similar TODO sitting on my back burner, waiting for hardware access to test.

POWER9 NX 842 is my target for optimizing this, and all the kernel bits for it already
ship in Linux.

ORC is actually tuned fairly specifically for x86_64 Zlib performance: the different columns use
different levels & strategies, which worked well on libzip.
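
To make "levels & strategies" concrete, here is a minimal sketch using only java.util.zip, the
JDK's zlib binding discussed in this thread. It is not ORC's codec: the class name DeflateSketch,
the sample data and the two level/strategy pairs are assumptions picked for illustration, not the
values ORC actually assigns to any column.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration class; not part of ORC or Hive.
public class DeflateSketch {

    // Compress with an explicit zlib level and strategy.
    // nowrap=true emits raw deflate data without the zlib header and checksum.
    static byte[] compress(byte[] input, int level, int strategy) {
        Deflater deflater = new Deflater(level, true);
        deflater.setStrategy(strategy);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate raw deflate data back into at most originalLength bytes.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater(true);
        // The Inflater docs ask for one extra dummy input byte when nowrap is used.
        inflater.setInput(Arrays.copyOf(compressed, compressed.length + 1));
        byte[] out = new byte[originalLength];
        int off = 0;
        try {
            while (off < out.length && !inflater.finished()) {
                int n = inflater.inflate(out, off, out.length - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unexpected input; stop instead of spinning
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate data", e);
        } finally {
            inflater.end();
        }
        return Arrays.copyOf(out, off);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("row-").append(i % 17).append(',');
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Two illustrative level/strategy pairs (assumed, not ORC's real mapping).
        byte[] fast = compress(data, Deflater.BEST_SPEED, Deflater.DEFAULT_STRATEGY);
        byte[] small = compress(data, Deflater.DEFAULT_COMPRESSION, Deflater.FILTERED);

        System.out.println("input bytes:           " + data.length);
        System.out.println("BEST_SPEED/DEFAULT:    " + fast.length);
        System.out.println("DEFAULT_LEVEL/FILTERED: " + small.length);
        System.out.println("round trip ok: "
                + Arrays.equals(data, decompress(fast, data.length)));
    }
}
```

Every call above goes through whichever zlib the JVM was built with, so swapping the system libz.so
only changes these numbers if the JDK actually links against it.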

hive.exec.orc.encoding.strategy & hive.exec.orc.compression.strategy are both set to SPEED
so that standard Zlib is good enough.

With hardware acceleration, the COMPRESSION mode for both settings might not produce a performance
hit (& in fact, it might be faster, since smaller blocks need less bandwidth both ways over the bus).

Cheers,
Gopal
