incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: compression
Date Wed, 26 Sep 2012 01:40:21 GMT
Check the logs on nodes 2 and 3 to see if the scrub started. The logs on node 1, where it
ran, will show you what to look for.
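
One way to do that check, as a minimal sketch: filter each node's system log for lines that mention "scrub" and compare the results across nodes. The helper names `scrub_lines` and `scan_log` are hypothetical, and the log path in the comment is an assumed default that may differ on your install. The match is a plain substring test and assumes nothing about the exact log wording.

```python
from typing import Iterable, List


def scrub_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention 'scrub', case-insensitively."""
    return [line.rstrip("\n") for line in lines if "scrub" in line.lower()]


def scan_log(path: str) -> List[str]:
    """Read a log file and return its scrub-related lines."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, errors="replace") as fh:
        return scrub_lines(fh)


# Example call (the path is an assumption, adjust for your install):
# for line in scan_log("/var/log/cassandra/system.log"):
#     print(line)
```

If node 1 returns matching lines for the time you ran the scrub and nodes 2 and 3 return none, the scrub probably never started there.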

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/09/2012, at 10:31 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:

> Hi!
> I ran 
> UPDATE COLUMN FAMILY cf_name WITH compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};
> 
> I then ran on all my nodes (3)
> sudo nodetool -h localhost scrub tok cf_name
> 
> I have replication factor 3. The size of the data on disk was cut in half on the first
> node, and in JMX I can see that the compression ratio is indeed 0.46. But on nodes 2 and
> 3 nothing happened. In JMX I can see that the compression ratio is 0 and the size of the files
> on disk stayed the same.
> 
> In cli 
> 
> ColumnFamily: cf_name
>       Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>       Columns sorted by: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>       Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>       Key cache size / save period in seconds: 200000.0/14400
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: true
>       Bloom Filter FP chance: default
>       Built indexes: []
>       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>       Compression Options:
>         chunk_length_kb: 64
>         sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
> 
> Can anyone help?
> Thanks
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> tamar@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
> Thanks all, that helps. Will start with one - two CFs and let you know the effect
> 
> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
> Also, your unlimited column names may all have the same prefix, right? Like "accounts".rowkey56,
> "accounts".rowkey78, etc., so the "accounts" prefix gets a ton of compression.
> 
> Later,
> Dean
> 
> From: Tyler Hobbs <tyler@datastax.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Sunday, September 23, 2012 11:46 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: compression
> 
>  column metadata, you're still likely to get a reasonable amount of compression.  This
> is especially true if there is some amount of repetition in the column names, values, or TTLs
> in wide rows.  Compression will almost always be beneficial unless you're already somehow
> CPU bound or are using large column values that are high in entropy, such as pre-compressed
> or encrypted data.
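
A rough illustration of the repetition-versus-entropy point, as a sketch only: it uses zlib from the Python standard library as a stand-in for SnappyCompressor, so the absolute numbers will not match what Cassandra reports. The function name `compression_ratio` and the sample data are made up for the example. The ratio is taken as compressed size divided by original size, which appears to be the sense in which the 0.46 above goes with the data roughly halving.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means better)."""
    return len(zlib.compress(data)) / len(data)


# Column names sharing a prefix, modelled on the "accounts".rowkeyNN example.
names = "".join(f"accounts.rowkey{i}," for i in range(5000)).encode("utf-8")

# Random bytes as a stand-in for pre-compressed or encrypted values.
noise = os.urandom(len(names))

print(f"shared-prefix names: {compression_ratio(names):.2f}")
print(f"random bytes:        {compression_ratio(noise):.2f}")
```

The shared-prefix data should shrink substantially, while the random bytes should stay at about their original size or grow slightly.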
> 
> 

