cassandra-user mailing list archives

From Tamar Fraenkel <ta...@tok-media.com>
Subject Re: compression
Date Wed, 24 Oct 2012 10:41:02 GMT
Hi!
I tried again, and I can see the scrub action in the Cassandra logs:
 INFO [CompactionExecutor:4029] 2012-10-24 10:36:54,108 CompactionManager.java (line 476) Scrubbing SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-339-Data.db')
 INFO [CompactionExecutor:4029] 2012-10-24 10:36:54,184 CompactionManager.java (line 658) Scrub of SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-339-Data.db') complete: 54 rows in new sstable and 0 empty (tombstoned) rows dropped
 INFO [CompactionExecutor:4029] 2012-10-24 10:36:54,185 CompactionManager.java (line 476) Scrubbing SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-340-Data.db')
 INFO [CompactionExecutor:4029] 2012-10-24 10:36:54,914 CompactionManager.java (line 658) Scrub of SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-340-Data.db') complete: 7037 rows in new sstable and 0 empty (tombstoned) rows dropped
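
For comparing nodes, the "Scrub of ... complete" entries can be pulled out of a log with a short script. This is only a minimal sketch: it assumes the entries look like the ones quoted here, and the function name scrub_results is made up for illustration.

```python
import re

# Pattern for the "Scrub of ... complete" entries quoted in this message.
# Assumption: other Cassandra versions may word this line differently.
SCRUB_DONE = re.compile(
    r"Scrub of SSTableReader\(path='(?P<path>[^']+)'\)\s+complete:\s+"
    r"(?P<rows>\d+) rows in new sstable and (?P<dropped>\d+) empty"
)

def scrub_results(log_text):
    """Return (sstable_path, rows_written, rows_dropped) per completed scrub."""
    return [
        (m.group("path"), int(m.group("rows")), int(m.group("dropped")))
        for m in SCRUB_DONE.finditer(log_text)
    ]
```

Running it over the log text of each node gives one tuple per scrubbed SSTable, so a node where the scrub never started returns an empty list.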

I don't see any CompressionInfo.db files, and the compression ratio is still
0.0 on this node only; on the other nodes it is almost 0.5.
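
The check for missing CompressionInfo.db files can be scripted along these lines. It is a rough sketch that assumes each SSTable's components share the file-name prefix seen in the log above (for example tk_usus_user-hc-339-); the function name uncompressed_sstables is hypothetical.

```python
from pathlib import Path

def uncompressed_sstables(data_dir):
    """Return Data.db file names that have no CompressionInfo.db sibling."""
    missing = []
    for data_file in sorted(Path(data_dir).glob("*-Data.db")):
        # Assumed naming: "<prefix>Data.db" sits next to "<prefix>CompressionInfo.db".
        prefix = data_file.name[: -len("Data.db")]
        if not (data_file.parent / (prefix + "CompressionInfo.db")).exists():
            missing.append(data_file.name)
    return missing
```

For example, uncompressed_sstables("/raid0/cassandra/data/tok") would list the SSTables in the keyspace directory from the log that were still written without compression.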

Any idea?

Thanks,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

tamar@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Wed, Sep 26, 2012 at 3:40 AM, aaron morton <aaron@thelastpickle.com> wrote:

> Check the logs on nodes 2 and 3 to see if the scrub started. The logs on
> node 1 will be a good help with that.
>
> Cheers
>
>   -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/09/2012, at 10:31 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>
> Hi!
> I ran
> UPDATE COLUMN FAMILY cf_name WITH
> compression_options={sstable_compression:SnappyCompressor,
> chunk_length_kb:64};
>
> I then ran the following on all 3 of my nodes:
> sudo nodetool -h localhost scrub tok cf_name
>
> I have replication factor 3. The size of the data on disk was cut in half
> on the first node, and in JMX I can see that the compression ratio is
> indeed 0.46. But on nodes 2 and 3 nothing happened: in JMX the compression
> ratio is 0, and the size of the files on disk stayed the same.
>
> In the CLI, the column family shows:
>
> ColumnFamily: cf_name
>       Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>       Columns sorted by: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>       Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>       Key cache size / save period in seconds: 200000.0/14400
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: true
>       Bloom Filter FP chance: default
>       Built indexes: []
>       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>       Compression Options:
>         chunk_length_kb: 64
>         sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>
> Can anyone help?
> Thanks
>
> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>
>> Thanks all, that helps. I will start with one or two CFs and let you know
>> the effect.
>>
>>
>> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>>
>>> Also, your unlimited column names may all share the same prefix, right?
>>> Like "accounts".rowkey56, "accounts".rowkey78, etc., so the "accounts"
>>> prefix gets a ton of compression then.
>>>
>>> Later,
>>> Dean
>>>
>>> From: Tyler Hobbs <tyler@datastax.com>
>>> Reply-To: user@cassandra.apache.org
>>> Date: Sunday, September 23, 2012 11:46 AM
>>> To: user@cassandra.apache.org
>>> Subject: Re: compression
>>>
>>>  column metadata, you're still likely to get a reasonable amount of
>>> compression.  This is especially true if there is some amount of repetition
>>> in the column names, values, or TTLs in wide rows.  Compression will almost
>>> always be beneficial unless you're already somehow CPU bound or are using
>>> large column values that are high in entropy, such as pre-compressed or
>>> encrypted data.
>>>