hive-user mailing list archives

From Gopal Vijayaraghavan
Subject Re: Parquet tables with snappy compression
Date Wed, 25 Jan 2017 22:13:20 GMT

> Has there been any study of how much compressing Hive Parquet tables with snappy reduces
> storage space or simply the table size in quantitative terms?

Since Snappy is essentially just LZ77 (back-references to earlier bytes, with no entropy coding
stage), I would expect it to help mostly when Parquet leaf columns contain text with large
common sub-chunks (like URLs or log data).
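
To get a rough feel for that effect outside Hive, here is a minimal sketch in Python. It assumes zlib from the standard library as a stand-in for Snappy (both are LZ77-family, but zlib adds Huffman coding, so its ratios will be better than Snappy's), and the URL pattern and row count are made up for illustration. It only shows the contrast between repetitive text and data with no repeats; it is not a measurement of Parquet tables.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical URL-like values that share a long common prefix.
urls = "\n".join(
    f"https://example.com/products/category/{rng.randrange(50)}/item?id={rng.randrange(10**6)}"
    for _ in range(5000)
).encode()

# The same number of bytes with no repeated structure at all.
noise = bytes(rng.getrandbits(8) for _ in range(len(urls)))

url_ratio = compression_ratio(urls)
noise_ratio = compression_ratio(noise)

print(f"URL-like text: {url_ratio:.2f}x")
print(f"random bytes:  {noise_ratio:.2f}x")
```

The URL-like column should shrink several times over, while the random bytes should stay at roughly their original size. To measure real tables, compare the on-disk size of the same data written with and without compression.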

If you want to experiment with that corner case, the L_COMMENT field from TPC-H lineitem is
a good compression-thrasher.

