hive-dev mailing list archives

From "He Yongqiang (JIRA)" <>
Subject [jira] Commented: (HIVE-1838) Add quickLZ compression codec for Hive.
Date Wed, 08 Dec 2010 02:21:01 GMT


He Yongqiang commented on HIVE-1838:

No. I mean a compression codec for Hive. It could be used to compress intermediate data.

Here are some results:

5. Hadoop compression with native library (COMPRESSLEVEL=BEST_SPEED)
time java -Djava.library.path=/data/users/heyongqiang/hadoop-0.20/build/native/Linux-amd64-64/lib/

real	0m34.179s
user	0m29.031s
sys	0m1.607s

compressed size: 275M

6. LZF
[heyongqiang@dev782 compress_test]$ time lzf -c 000000_0 

real	0m39.031s
user	0m8.727s
sys	0m2.231s

compressed size: 393M

7. FastLZ
time fastlz/6pack -1 000000_0 000000_0.fastlz

real	0m19.020s
user	0m18.083s
sys	0m0.935s

compressed size: 391M

8. QuickLZ
time ./compress_file ../000000_0 ../000000_0.quicklz

real	0m15.652s
user	0m14.047s
sys	0m1.603s

compressed size: 334M
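
For easier comparison, the figures reported above can be put side by side. The sketch below only re-uses the "real" times and compressed sizes quoted in this message; the choice of the Hadoop native run as the baseline, and the names RESULTS and summarize, are assumptions made for illustration. The size of the uncompressed input is not given in the message, so no compression ratio against the original file is computed.

```python
# Figures copied from the runs reported in this message:
# (label, wall-clock "real" seconds, compressed size in MB).
RESULTS = [
    ("Hadoop native (BEST_SPEED)", 34.179, 275),
    ("LZF", 39.031, 393),
    ("FastLZ", 19.020, 391),
    ("QuickLZ (compress_file)", 15.652, 334),
]


def summarize(results, baseline_index=0):
    """Return (label, real_s, size_mb, time_ratio, size_ratio) rows.

    Ratios are relative to the entry at baseline_index (an arbitrary
    choice here: the Hadoop native run).
    """
    _, base_time, base_size = results[baseline_index]
    rows = []
    for label, real_s, size_mb in results:
        rows.append((label, real_s, size_mb,
                     real_s / base_time, size_mb / base_size))
    return rows


if __name__ == "__main__":
    for label, real_s, size_mb, t_ratio, s_ratio in summarize(RESULTS):
        print(f"{label:28s} {real_s:8.3f}s {size_mb:5d}M "
              f"time x{t_ratio:.2f}  size x{s_ratio:.2f}")
```

A ratio below 1.0 means faster (or smaller) than the baseline run, and above 1.0 means slower (or larger).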

I modified QuickLZ's compress_file code to use a buffer, so that the comparison is fair. The
result turns out to be very close to FastLZ: the modified version of QuickLZ is just one second faster.
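
The modified compress_file is not shown in this message, so the following is only a rough sketch of what reading and compressing through a fixed-size buffer generally looks like. It uses Python's zlib at Z_BEST_SPEED as a stand-in, not QuickLZ, and the function name compress_stream and the 64 KB buffer size are assumptions for illustration.

```python
import io
import zlib


def compress_stream(src, dst, buffer_size=64 * 1024, level=zlib.Z_BEST_SPEED):
    """Compress src into dst, reading buffer_size bytes at a time.

    src and dst are binary file-like objects.
    Returns (bytes_in, bytes_out).
    """
    compressor = zlib.compressobj(level)
    bytes_in = bytes_out = 0
    while True:
        chunk = src.read(buffer_size)
        if not chunk:
            break
        bytes_in += len(chunk)
        out = compressor.compress(chunk)
        bytes_out += len(out)
        dst.write(out)
    # Emit whatever the compressor is still holding internally.
    tail = compressor.flush()
    bytes_out += len(tail)
    dst.write(tail)
    return bytes_in, bytes_out


if __name__ == "__main__":
    import time

    # Synthetic input; the real test file (000000_0) is not available here.
    data = b"hive intermediate row\t42\n" * 200_000
    src, dst = io.BytesIO(data), io.BytesIO()
    start = time.perf_counter()
    n_in, n_out = compress_stream(src, dst)
    elapsed = time.perf_counter() - start
    print(f"in={n_in} out={n_out} elapsed={elapsed:.3f}s")
```

Reading in larger chunks reduces the number of read and write calls per byte, which is one reason an unbuffered tool can look slower than a buffered one for reasons unrelated to the codec itself.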

> Add quickLZ compression codec for Hive.
> ---------------------------------------
>                 Key: HIVE-1838
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
