cassandra-dev mailing list archives

From Minh Do <...@netflix.com.INVALID>
Subject Re: Cassandra decompress already compressed data
Date Tue, 21 Feb 2017 20:13:15 GMT
Hi Alive,

I think your observation is in line with what I saw earlier during my own
debugging session.

What you have enabled is on-the-wire compression between the Java driver and
the C* server.

C* storage has a separate flag and mechanism for compressing the data it
stores in the SSTable files and, as you can see, it is set at the table level.

I think the C* server needs to decompress the compressed data from the socket
to understand the data before processing it.
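
To illustrate why the two settings are independent, here is a toy sketch in Python. It is not Cassandra code: the one-byte frame layout, the function names, and the use of zlib in place of LZ4 are all assumptions made for the example. The point is only that the compressed payload on the wire is the whole request, so the server has to decompress it before it can read anything in it, and that what ends up on disk depends solely on the table-level setting.

```python
import zlib

# Hypothetical flag bit for this toy frame format (not the real protocol layout).
COMPRESSED_FLAG = 0x01


def client_encode(body: bytes, compress: bool) -> bytes:
    """Driver side: one flag byte, then the body (compressed if requested)."""
    if compress:
        return bytes([COMPRESSED_FLAG]) + zlib.compress(body)
    return bytes([0]) + body


def server_decode(frame: bytes) -> bytes:
    """Server side: undo transport compression so the request can be parsed."""
    flags, payload = frame[0], frame[1:]
    if flags & COMPRESSED_FLAG:
        return zlib.decompress(payload)
    return payload


def server_store(value: bytes, table_compression: bool) -> bytes:
    """Storage side: compress only if the table-level option says so."""
    return zlib.compress(value) if table_compression else value


blob = b"A" * 1_000_000                      # highly compressible test data
frame = client_encode(blob, compress=True)   # small on the wire
value = server_decode(frame)                 # full size again in server memory
on_disk_plain = server_store(value, table_compression=False)
on_disk_packed = server_store(value, table_compression=True)

print(len(frame), len(on_disk_plain), len(on_disk_packed))
```

With table compression off, the stored size equals the original blob size no matter how small the frame was on the wire, which matches what you observed.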

On Tue, Feb 21, 2017 at 11:59 AM, Мириан Джачвадзе <alivesubstance@gmail.com> wrote:

> This question relates to both the C* driver and the C* server itself. I'll
> ask it here and in the C* driver developers group too. I hope somebody can
> give me an answer or explain why it works the way it does.
>
>
> This concerns Cassandra 2.2.7 (I've checked with C* 3.* and the result is
> the same) and driver 3.0.2.
>
>
> I am experimenting with blobs in this table:
>
> CREATE TABLE compression_test (
>     id uuid,
>     chunk int,
>     data blob,
>     PRIMARY KEY (id, chunk))
> WITH compression = { 'sstable_compression' : '' };
>
>
> The Java Cassandra driver has a compression option out of the box. It can
> compress data before sending it to the socket. I assumed that the driver
> compresses the source byte array, sends it to C*, and that after processing
> it is stored as is in the SSTable. That is why the table is configured
> without compression at the SSTable level. But that was only my assumption.
> During my investigation I found out that I was wrong.
>
>
> In my case I chose LZ4 compression in the driver, and 0.5 GB of data was
> compressed down to 40 MB. Before sending data to the socket, the driver
> compresses it, sets the COMPRESSED flag at the beginning of the compressed
> array, and sends it to the C* server. In the debugger I see that exactly
> 40 MB was written to the socket.
>
>
> Meanwhile, the C* server has a Server.Initializer class that initializes the
> netty ChannelPipeline and appends several ChannelHandlers to it. One of the
> ChannelHandlers is Frame.Decompressor. It checks whether a given frame has
> the COMPRESSED flag and decompresses the data if the flag is set. While
> debugging I also see that C* receives exactly the 40 MB chunk, finds the
> COMPRESSED flag, decompresses it, and processes it. This leads to extra
> memory and disk consumption. I'm not sure how it's stored in the memtable,
> but I am absolutely sure that 0.5 GB of decompressed data is stored on the
> disk drive. And that's not what I want. I can set the LZ4 compressor for the
> SSTable and the received chunk will be stored compressed, but that's also
> not what I want.
>
>
> For what purpose does C* decompress already compressed data? Am I missing
> something in the C* or driver configuration?
>
