incubator-cassandra-user mailing list archives

From Tamar Fraenkel <ta...@tok-media.com>
Subject Re: compression
Date Fri, 28 Sep 2012 14:38:58 GMT
Hi!
The situation still hasn't resolved. Does anyone have a clue?
Thanks

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

tamar@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Thu, Sep 27, 2012 at 10:42 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:

> Hi!
> First, the problem is still there, although I checked and all nodes agree on
> the schema.
> This is from ls -l
> Good Node
> -rw-r--r-- 1 cassandra cassandra        606 2012-09-27 08:01
> tk_usus_user-hc-269-CompressionInfo.db
> -rw-r--r-- 1 cassandra cassandra    2246431 2012-09-27 08:01
> tk_usus_user-hc-269-Data.db
> -rw-r--r-- 1 cassandra cassandra      11056 2012-09-27 08:01
> tk_usus_user-hc-269-Filter.db
> -rw-r--r-- 1 cassandra cassandra     129792 2012-09-27 08:01
> tk_usus_user-hc-269-Index.db
> -rw-r--r-- 1 cassandra cassandra       4336 2012-09-27 08:01
> tk_usus_user-hc-269-Statistics.db
>
> Node 2
> -rw-r--r-- 1 cassandra cassandra    4592393 2012-09-27 08:01
> tk_usus_user-hc-268-Data.db
> -rw-r--r-- 1 cassandra cassandra         69 2012-09-27 08:01
> tk_usus_user-hc-268-Digest.sha1
> -rw-r--r-- 1 cassandra cassandra      11056 2012-09-27 08:01
> tk_usus_user-hc-268-Filter.db
> -rw-r--r-- 1 cassandra cassandra     129792 2012-09-27 08:01
> tk_usus_user-hc-268-Index.db
> -rw-r--r-- 1 cassandra cassandra       4336 2012-09-27 08:01
> tk_usus_user-hc-268-Statistics.db
>
> Node 3
> -rw-r--r-- 1 cassandra cassandra   4592393 2012-09-27 08:01
> tk_usus_user-hc-278-Data.db
> -rw-r--r-- 1 cassandra cassandra        69 2012-09-27 08:01
> tk_usus_user-hc-278-Digest.sha1
> -rw-r--r-- 1 cassandra cassandra     11056 2012-09-27 08:01
> tk_usus_user-hc-278-Filter.db
> -rw-r--r-- 1 cassandra cassandra    129792 2012-09-27 08:01
> tk_usus_user-hc-278-Index.db
> -rw-r--r-- 1 cassandra cassandra      4336 2012-09-27 08:01
> tk_usus_user-hc-278-Statistics.db
>
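The listings above differ in one telling way: the good node has a CompressionInfo.db component, while nodes 2 and 3 have a Digest.sha1 and no CompressionInfo.db. As far as I know, CompressionInfo.db is only written for compressed SSTables, so its absence is a quick way to spot SSTables that were not rewritten with compression. A minimal sketch of that check follows; the function name and the file-name pattern are my own assumptions, based only on the names shown in the listings.

```python
import re
from collections import defaultdict

# Pattern inferred from names such as "tk_usus_user-hc-269-Data.db";
# it is an assumption, not an official Cassandra naming specification.
SSTABLE_FILE = re.compile(
    r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>[A-Za-z]+)\.(?:db|sha1)$"
)


def uncompressed_sstables(filenames):
    """Return (cf, generation) pairs that have Data.db but no CompressionInfo.db."""
    components = defaultdict(set)
    for name in filenames:
        match = SSTABLE_FILE.match(name)
        if match:
            key = (match.group("cf"), int(match.group("gen")))
            components[key].add(match.group("component"))
    return sorted(
        key
        for key, found in components.items()
        if "Data" in found and "CompressionInfo" not in found
    )


if __name__ == "__main__":
    import os
    import sys

    # Usage: python check_sstables.py <column family data directory>
    for cf, gen in uncompressed_sstables(os.listdir(sys.argv[1])):
        print(f"{cf} generation {gen}: no CompressionInfo.db")
```

Run against each node's data directory, this would list generation 268 on node 2 and 278 on node 3, and nothing on the good node.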
> Looking at the logs, on the "good node" I can see
>
>  INFO [MigrationStage:1] 2012-09-24 10:08:16,511 Migration.java (line 119)
> Applying migration c22413b0-062f-11e2-0000-1bcb936807db Update column
> family to org.apache.cassandra.config.CFMetaData@1dbdcde9
> [cfId=1016,ksName=tok,cfName=tk_usus_user,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=864000,defaultValidator=org.apache.cassandra.db.marshal.UTF8Type,keyValidator=org.apache.cassandra.db.marshal.UUIDType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@3505231c,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=485
> lim=488 cap=653],column_metadata={},compactionStrategyClass=class
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor,
> chunk_length_kb=64},bloomFilterFpChance=<null>]
>
> But same can be seen in the logs of the two other nodes:
>  INFO [MigrationStage:1] 2012-09-24 10:08:16,767 Migration.java (line 119)
> Applying migration c22413b0-062f-11e2-0000-1bcb936807db Update column
> family to org.apache.cassandra.config.CFMetaData@24fbb95d
> [cfId=1016,ksName=tok,cfName=tk_usus_user,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=864000,defaultValidator=org.apache.cassandra.db.marshal.UTF8Type,keyValidator=org.apache.cassandra.db.marshal.UUIDType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@a469ba3,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0
> lim=3 cap=3],column_metadata={},compactionStrategyClass=class
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor,
> chunk_length_kb=64},bloomFilterFpChance=<null>]
>
>  INFO [MigrationStage:1] 2012-09-24 10:08:16,705 Migration.java (line 119)
> Applying migration c22413b0-062f-11e2-0000-1bcb936807db Update column
> family to org.apache.cassandra.config.CFMetaData@216b6a58
> [cfId=1016,ksName=tok,cfName=tk_usus_user,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),subcolumncomparator=<null>,comment=,rowCacheSize=0.0,keyCacheSize=200000.0,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=864000,defaultValidator=org.apache.cassandra.db.marshal.UTF8Type,keyValidator=org.apache.cassandra.db.marshal.UUIDType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@1312c88c,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0
> lim=3 cap=3],column_metadata={},compactionStrategyClass=class
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor,
> chunk_length_kb=64},bloomFilterFpChance=<null>]
>
>
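To confirm that all three nodes applied the same compression settings, the compressionOptions part of each migration log line can be extracted and compared. This is only a sketch: the regular expression and the function name are assumptions of mine, and it expects the raw log text from system.log rather than the line-wrapped, quoted copy in this mail.

```python
import re

# Captures the text between the braces of "compressionOptions={...}".
COMPRESSION_OPTS = re.compile(r"compressionOptions=\{([^}]*)\}")


def compression_options(log_line):
    """Return the compression options in a migration log line as a dict, or None."""
    match = COMPRESSION_OPTS.search(log_line)
    if match is None:
        return None
    options = {}
    for pair in match.group(1).split(","):
        if "=" in pair:
            key, value = pair.split("=", 1)
            options[key.strip()] = value.strip()
    return options
```

If the dictionaries from all three nodes are equal, as the log excerpts above suggest, the difference between the nodes is not in the schema itself.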
> I can also see scrub messages in logs
> Good node:
>  INFO [CompactionExecutor:1774] 2012-09-24 10:09:05,402
> CompactionManager.java (line 476) Scrubbing
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-264-Data.db')
>  INFO [CompactionExecutor:1774] 2012-09-24 10:09:05,934
> CompactionManager.java (line 658) Scrub of
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-264-Data.db')
> complete: 4868 rows in new sstable and 0 empty (tombstoned) rows dropped
>
> Other nodes
>
>  INFO [CompactionExecutor:1800] 2012-09-24 10:09:11,789
> CompactionManager.java (line 476) Scrubbing
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-260-Data.db')
>  INFO [CompactionExecutor:1800] 2012-09-24 10:09:12,464
> CompactionManager.java (line 658) Scrub of
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-260-Data.db')
> complete: 4868 rows in new sstable and 0 empty (tombstoned) rows dropped
>
>  INFO [CompactionExecutor:1687] 2012-09-24 10:09:16,235
> CompactionManager.java (line 476) Scrubbing
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-271-Data.db')
>  INFO [CompactionExecutor:1687] 2012-09-24 10:09:16,898
> CompactionManager.java (line 658) Scrub of
> SSTableReader(path='/raid0/cassandra/data/tok/tk_usus_user-hc-271-Data.db')
> complete: 4868 rows in new sstable and 0 empty (tombstoned) rows dropped
>
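Checking by eye whether every scrub that started also finished gets tedious with more SSTables. The sketch below pairs the "Scrubbing" and "Scrub of ... complete" messages from a log; the patterns are derived only from the excerpts above and may not match other Cassandra versions, and the function name is hypothetical.

```python
import re

# \s+ tolerates the line wrapping seen in the excerpts above.
SCRUB_START = re.compile(r"Scrubbing\s+SSTableReader\(path='([^']+)'\)")
SCRUB_DONE = re.compile(
    r"Scrub of\s+SSTableReader\(path='([^']+)'\)\s+complete: (\d+) rows"
)


def scrub_summary(log_text):
    """Map each scrubbed SSTable path to its row count, or None if never completed."""
    started = set(SCRUB_START.findall(log_text))
    completed = {path: int(rows) for path, rows in SCRUB_DONE.findall(log_text)}
    return {path: completed.get(path) for path in sorted(started)}
```

A value of None for a path would mean the scrub started but no completion message was found. In the excerpts above every scrub completed with 4868 rows, so the scrub itself ran on all three nodes.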
> Any idea?
> Thanks!!
>
> On Wed, Sep 26, 2012 at 3:40 AM, aaron morton <aaron@thelastpickle.com> wrote:
>
>> Check the logs on nodes 2 and 3 to see if the scrub started. The logs on
>> node 1 will be a good reference for that.
>>
>> Cheers
>>
>>   -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 24/09/2012, at 10:31 PM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>>
>> Hi!
>> I ran
>> UPDATE COLUMN FAMILY cf_name WITH
>> compression_options={sstable_compression:SnappyCompressor,
>> chunk_length_kb:64};
>>
>> I then ran on all my nodes (3)
>> sudo nodetool -h localhost scrub tok cf_name
>>
>> I have replication factor 3. On the first node the size of the data on disk
>> was cut in half, and in JMX I can see that the compression ratio is indeed
>> 0.46. But on nodes 2 and 3 nothing happened: JMX shows a compression ratio
>> of 0 and the size of the files on disk stayed the same.
>>
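As a rough cross-check of the ratio reported in JMX, the compressed Data.db size can be divided by the uncompressed one. The sketch below uses the Data.db sizes from the 27 Sep listing earlier in this thread and assumes that the uncompressed copy on node 2 holds the same data as the compressed copy on the good node, which is plausible with replication factor 3 and identical Index.db sizes but is not guaranteed.

```python
def compression_ratio(compressed_bytes, uncompressed_bytes):
    """Compressed size divided by uncompressed size (lower means better compression)."""
    return compressed_bytes / uncompressed_bytes


# Data.db sizes in bytes, taken from the ls -l output in this thread.
GOOD_NODE_DATA = 2246431  # compressed (CompressionInfo.db present)
NODE_2_DATA = 4592393     # uncompressed (no CompressionInfo.db)

if __name__ == "__main__":
    print(f"{compression_ratio(GOOD_NODE_DATA, NODE_2_DATA):.2f}")
```

That comes out at roughly 0.49, in the same range as the 0.46 seen in JMX a few days earlier.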
>> In cli
>>
>> ColumnFamily: cf_name
>>       Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
>>       Default column value validator:
>> org.apache.cassandra.db.marshal.UTF8Type
>>       Columns sorted by:
>> org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
>>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>>       Row Cache Provider:
>> org.apache.cassandra.cache.SerializingCacheProvider
>>       Key cache size / save period in seconds: 200000.0/14400
>>       GC grace seconds: 864000
>>       Compaction min/max thresholds: 4/32
>>       Read repair chance: 1.0
>>       Replicate on write: true
>>       Bloom Filter FP chance: default
>>       Built indexes: []
>>       Compaction Strategy:
>> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>>       Compression Options:
>>         chunk_length_kb: 64
>>         sstable_compression:
>> org.apache.cassandra.io.compress.SnappyCompressor
>>
>> Can anyone help?
>> Thanks
>>
>> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
>>
>>> Thanks all, that helps. I will start with one or two CFs and let you know
>>> the effect.
>>>
>>>
>>> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>>>
>>>> Also, your unlimited column names may all share the same prefix,
>>>> right? Like "accounts".rowkey56, "accounts".rowkey78, etc., so the
>>>> "accounts" prefix gets a ton of compression.
>>>>
>>>> Later,
>>>> Dean
>>>>
>>>> From: Tyler Hobbs <tyler@datastax.com>
>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Date: Sunday, September 23, 2012 11:46 AM
>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>> Subject: Re: compression
>>>>
>>>>  column metadata, you're still likely to get a reasonable amount of
>>>> compression.  This is especially true if there is some amount of repetition
>>>> in the column names, values, or TTLs in wide rows.  Compression will almost
>>>> always be beneficial unless you're already somehow CPU bound or are using
>>>> large column values that are high in entropy, such as pre-compressed or
>>>> encrypted data.
>>>>
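The point about repetition versus high-entropy values is easy to try out. The sketch below uses zlib from the Python standard library as a stand-in for Snappy, which is an assumption made for convenience: absolute numbers will differ from what Cassandra achieves, but the contrast between repetitive column names and random bytes should be similar. The column names are made up in the style of Dean's example.

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib as a stand-in codec."""
    return len(zlib.compress(data)) / len(data)


# Column names sharing a common prefix, as in "accounts".rowkey56.
repetitive = b"".join(b"accounts.rowkey%d" % i for i in range(5000))

# Random bytes approximate pre-compressed or encrypted values.
high_entropy = os.urandom(len(repetitive))

if __name__ == "__main__":
    print(f"repetitive names: {ratio(repetitive):.2f}")
    print(f"high-entropy data: {ratio(high_entropy):.2f}")
```

The repetitive input shrinks to a fraction of its size, while the random input does not shrink at all and may even grow slightly.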
>>>
>>>
>>
>>
>
