hbase-issues mailing list archives

From "Karthick Sankarachary (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3732) New configuration option for client-side compression
Date Mon, 09 May 2011 19:45:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030872#comment-13030872 ]

Karthick Sankarachary commented on HBASE-3732:
----------------------------------------------

bq. Not so sure about that. I think there are many easy ways to solve this, but most of them
involve polluting the API or doing weird acrobatics in the client. Compressing/decompressing
is easy, it's all about where you're going to do it in the code.

As Stack suggested, adding a compression bit to KeyValue#Type, say Compressed(128), would let
us tell whether a value is compressed. Alternatively, we could define a Type#PutCompressed
value and have the server handle it the same way as Type#Put. The actual compression (decompression)
would occur in Put (Result), using whichever compression algorithm the client is configured with.
For all intents and purposes, this change would be transparent to the end user.
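
To make the first option concrete, here is a minimal sketch of how a compression bit on the type byte could be set and tested. It is only an illustration, not the KeyValue code: the class name, the method names, and the COMPRESSED_FLAG constant are hypothetical, and the Put code of 4 is assumed here for the example.

```java
// Hypothetical sketch of a "compressed" bit on a KeyValue-style type byte.
// Nothing here is existing HBase API; all names are made up for illustration.
class CompressedTypeSketch {
    // Assumed flag value, following the Compressed(128) suggestion.
    static final int COMPRESSED_FLAG = 128;
    // Assumed type code for a plain Put, used only in this example.
    static final byte PUT = 4;

    // Returns the type code with the compression bit set.
    static byte markCompressed(byte type) {
        return (byte) (type | COMPRESSED_FLAG);
    }

    // True if the compression bit is set on the type code.
    static boolean isCompressed(byte type) {
        return (type & COMPRESSED_FLAG) != 0;
    }

    // Strips the compression bit, so the rest of the code sees the base type.
    static byte baseType(byte type) {
        return (byte) (type & ~COMPRESSED_FLAG);
    }

    public static void main(String[] args) {
        byte flagged = markCompressed(PUT);
        System.out.println("compressed? " + isCompressed(flagged));
        System.out.println("base type:  " + baseType(flagged));
    }
}
```

With something like this, the server could treat baseType(type) exactly as it treats a Put today, and only the client would need to look at the bit when building a Result.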

> New configuration option for client-side compression
> ----------------------------------------------------
>
>                 Key: HBASE-3732
>                 URL: https://issues.apache.org/jira/browse/HBASE-3732
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.92.0
>
>         Attachments: compressed_streams.jar
>
>
> We have a case here where we have to store very fat cells (arrays of integers), which
can amount to hundreds of KBs, that we need to read often, concurrently, and possibly
keep in cache. Compressing the values on the client using java.util.zip's Deflater before
sending them to HBase proved, in our case, to be almost an order of magnitude faster.
> The reasons are evident: less data sent to HBase, the memstore contains compressed data,
the block cache contains compressed data too, etc.
> I was thinking that it might be something useful to add to a family schema, so that Put/Result
do the conversion for you. The actual compression algorithm should also be configurable.
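
For reference, the client-side approach described in the issue can be sketched roughly as follows, using java.util.zip's Deflater and Inflater on the cell value before it is handed to Put and after it comes back in a Result. This is a minimal sketch under assumptions, not the code from the attachment: the class and method names are hypothetical, and the default compression level is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compress a cell value on the client before a Put,
// and decompress it after reading it back from a Result.
class ClientSideCompressionSketch {

    // Deflates the raw value with the default compression level.
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflates a value produced by compress(); rejects malformed input.
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read means the input is bad.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A "fat cell": an array of integers serialized to bytes.
        int[] values = new int[50_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i % 100;
        }
        ByteBuffer bb = ByteBuffer.allocate(values.length * 4);
        for (int v : values) {
            bb.putInt(v);
        }
        byte[] raw = bb.array();

        byte[] packed = compress(raw);
        byte[] restored = decompress(packed);
        System.out.println("raw bytes:    " + raw.length);
        System.out.println("packed bytes: " + packed.length);
        System.out.println("round trip ok: " + Arrays.equals(raw, restored));
    }
}
```

The compressed bytes would be what gets passed as the value of the Put, so the memstore and block cache would hold the smaller form as well. How much is saved depends entirely on the data.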

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
