hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Documenting Guidance on compression and codecs
Date Thu, 19 Sep 2013 03:34:59 GMT
Do you have any numbers on compression speed, too?
I continue to be surprised by the relative compression ratios between LZ4, LZO, and SNAPPY.
I had expected SNAPPY and LZO to be roughly equivalent and LZ4 to be far better than LZO.

-- Lars



________________________________
From: Nick Dimiduk <ndimiduk@gmail.com>
To: hbase-dev <dev@hbase.apache.org>
Sent: Wednesday, September 18, 2013 5:19 PM
Subject: Re: Documenting Guidance on compression and codecs


For completeness, here's an entry for LZ4:

+--------------------+--------------+
| compression:LZ4    |    391017061 |
+--------------------+--------------+



On Wed, Sep 11, 2013 at 12:10 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> Do we have a consolidated resource with information and recommendations
> about the use of compression and codecs? For instance, I ran a simple test
> using PerformanceEvaluation, examining just the size of data on disk for
> 1 GB of input data. The matrix below has some surprising results:
>
> +--------------------+--------------+
> | MODIFIER           | SIZE (bytes) |
> +--------------------+--------------+
> | none               |   1108553612 |
> +--------------------+--------------+
> | compression:SNAPPY |    427335534 |
> +--------------------+--------------+
> | compression:LZO    |    270422088 |
> +--------------------+--------------+
> | compression:GZ     |    152899297 |
> +--------------------+--------------+
> | codec:PREFIX       |   1993910969 |
> +--------------------+--------------+
> | codec:DIFF         |   1960970083 |
> +--------------------+--------------+
> | codec:FAST_DIFF    |   1061374722 |
> +--------------------+--------------+
> | codec:PREFIX_TREE  |   1066586604 |
> +--------------------+--------------+
>
> Where does a wayward soul look for guidance on which combination of the
> above to choose for their application?
>
> Thanks,
> Nick
>
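
To make the sizes quoted in this thread easier to compare, here is a minimal sketch that expresses each one as a fraction of the uncompressed baseline and sorts them from smallest to largest. The byte counts are copied from the two tables above; the names BASELINE, SIZES, relative_size and ranked are made up for this illustration and are not part of HBase or PerformanceEvaluation. It covers on-disk size only and says nothing about compression speed.

```python
# Sizes in bytes, copied from the tables in this thread.
# All names here are hypothetical helpers, not HBase APIs.
BASELINE = 1108553612  # the "none" row

SIZES = {
    "compression:SNAPPY": 427335534,
    "compression:LZ4": 391017061,
    "compression:LZO": 270422088,
    "compression:GZ": 152899297,
    "codec:PREFIX": 1993910969,
    "codec:DIFF": 1960970083,
    "codec:FAST_DIFF": 1061374722,
    "codec:PREFIX_TREE": 1066586604,
}


def relative_size(size, baseline=BASELINE):
    """Return size as a fraction of the baseline (below 1.0 means smaller)."""
    return size / baseline


def ranked(sizes=SIZES):
    """Return (modifier, size) pairs sorted from smallest to largest."""
    return sorted(sizes.items(), key=lambda kv: kv[1])


# Print one row per modifier: name, size in bytes, fraction of baseline.
for name, size in ranked():
    print(f"{name:<20} {size:>12} {relative_size(size):7.1%}")
```

On these numbers, PREFIX and DIFF come out larger than the unencoded baseline, which is presumably one of the surprising results mentioned in the original message.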