incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: compression
Date Mon, 29 Oct 2012 23:50:22 GMT
>  Any clue how to avoid it?
Not really sure what went wrong. Diagnosing that sort of problem usually takes access to the
running node and time to poke around and see what it does in response to various things.


Rebooting works for Windows 95 and Cassandra is not that different. 

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 29/10/2012, at 9:12 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:

> Hi!
> Thanks Aaron!
> Today I restarted Cassandra on that node and ran scrub again, now it is fine.
> 
> I am worried, though, that if I decide to change another CF to use compression I will have that issue again. Any clue how to avoid it?
> 
> Thanks.
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> tamar@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Wed, Sep 26, 2012 at 3:40 AM, aaron morton <aaron@thelastpickle.com> wrote:
> Check the logs on nodes 2 and 3 to see if the scrub started. The logs on node 1 will be a good help with that.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 24/09/2012, at 10:31 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:
> 
>> Hi!
>> I ran 
>> UPDATE COLUMN FAMILY cf_name WITH compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};
>> 
>> I then ran on all my nodes (3)
>> sudo nodetool -h localhost scrub tok cf_name
>> 
>> I have replication factor 3. The size of the data on disk was cut in half on the first node, and in JMX I can see that the compression ratio is indeed 0.46. But on nodes 2 and 3 nothing happened: in JMX the compression ratio is 0 and the size of the files on disk stayed the same.
>> 
>> In cli 
>> 
>> ColumnFamily: cf_name
>>       Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>       Columns sorted by: org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
>>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>>       Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>>       Key cache size / save period in seconds: 200000.0/14400
>>       GC grace seconds: 864000
>>       Compaction min/max thresholds: 4/32
>>       Read repair chance: 1.0
>>       Replicate on write: true
>>       Bloom Filter FP chance: default
>>       Built indexes: []
>>       Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>>       Compression Options:
>>         chunk_length_kb: 64
>>         sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>> 
>> Can anyone help?
>> Thanks
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> 
>> tamar@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
>> 
>> 
>> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>> Thanks all, that helps. Will start with one - two CFs and let you know the effect
>> 
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> 
>> tamar@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
>> 
>> 
>> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>> Also, your unlimited column names may all share the same prefix, right? Like "accounts".rowkey56, "accounts".rowkey78, etc., so the "accounts" prefix gets a ton of compression.
>> 
>> Later,
>> Dean
>> 
>> From: Tyler Hobbs <tyler@datastax.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Sunday, September 23, 2012 11:46 AM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: compression
>> 
>>  column metadata, you're still likely to get a reasonable amount of compression. This is especially true if there is some amount of repetition in the column names, values, or TTLs in wide rows. Compression will almost always be beneficial unless you're already somehow CPU bound or are using large column values that are high in entropy, such as pre-compressed or encrypted data.
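
A rough way to see the point about repetition versus entropy is to compress both kinds of data and compare. This is only a sketch: it uses Python's standard zlib as a stand-in for SnappyCompressor (an assumption made for convenience, so the numbers will not match what Cassandra reports), and the "accounts.rowkeyN" names are hypothetical, borrowed from the prefix example in this thread.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size / original size (lower means better compression)."""
    return len(zlib.compress(data)) / len(data)


# Hypothetical wide-row column names that all share the same prefix.
repetitive = b"".join(b"accounts.rowkey%d," % i for i in range(10000))

# Random bytes stand in for pre-compressed or encrypted column values.
high_entropy = os.urandom(len(repetitive))

print("repetitive:   %.2f" % compression_ratio(repetitive))
print("high entropy: %.2f" % compression_ratio(high_entropy))
```

The repetitive input should shrink considerably, while the random input should stay at roughly its original size, which is the case where the CPU spent on compression buys nothing.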
>> 
>> 
> 
> 

