cassandra-user mailing list archives

From Watanabe Maki <watanabe.m...@gmail.com>
Subject Re: 1.0.8 with Leveled compaction - Possible issues
Date Fri, 16 Mar 2012 22:44:00 GMT
The Cassandra team has released a new version every month for the last half year.
So I anticipate they will release 1.0.9 before April. Just my forecast :-)

maki


On 2012/03/16, at 22:41, Johan Elmerfjord <jelmerfj@adobe.com> wrote:

> Perfect.. this helped a lot - and I can confirm that I have run into the same issue as described in:
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201203.mbox/%3CCALqbeQbQ=d-hORVhA-LHOo_a5j46fQrsZMm+OQgfkgR=4RRQJQ@mail.gmail.com%3E
> 
> It goes down when it tries to move files up to a higher level that is out of bounds.
> 
> Nice that I could get an overview of the levels by looking in the .json file as well.
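
For readers who want the same overview without reading the file by eye, here is a minimal Python sketch that counts SSTables per level. The manifest layout it expects (a "generations" list whose entries carry a "generation" number and a "members" list) is an assumption that this thread does not confirm, and the sample text and function name are made up for illustration. Check a real TestCF1.json before relying on it.

```python
import json

def sstables_per_level(manifest_text):
    """Return {level: sstable_count} from a leveled-compaction manifest.

    Assumed layout (not confirmed by this thread):
    {"generations": [{"generation": 0, "members": [1, 2]}, ...]}
    """
    manifest = json.loads(manifest_text)
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest.get("generations", [])
    }

# Illustrative sample only; the member numbers are not taken from a real node.
sample = (
    '{"generations": ['
    '{"generation": 0, "members": []}, '
    '{"generation": 1, "members": [101, 102]}]}'
)
print(sstables_per_level(sample))
```

To run it against a node, pass the contents of the column family's .json file instead of the sample string.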
> 
> Any timeframe on when we can expect 1.0.9 to be released?
> 
> /Johan
> 
> 
> -- 
>   
> Johan Elmerfjord | Sr. Systems Administration/Mgr, EMEA | Adobe Systems, Product Technical Operations | p. +45 3231 6008 | x86008 | cell. +46 735 101 444 | Jelmerfj@adobe.com
> 
> On Thu, 2012-03-15 at 17:00 -0700, Watanabe Maki wrote:
>> 
>> Updating the column family with the LCS option and then running upgradesstables should convert all of your sstables.
>> Set this log4j config:
>> log4j.logger.org.apache.cassandra.db.compaction=DEBUG
>> in conf/log4j-server.properties and retry your procedure to find out what is happening.
>> 
>> 
>> maki
>> 
>> 
>> 
>> On 2012/03/16, at 7:05, Johan Elmerfjord <jelmerfj@adobe.com> wrote:
>> 
>> 
>> 
>>> Hi, I'm testing the community-version of Cassandra 1.0.8.
>>> We are currently on 0.8.7 in our production-setup.
>>> 
>>> We have 3 Column Families that each take between 20 and 35 GB on disk per node (8*2 nodes total).
>>> We would like to change to Leveled Compaction - and even try compression as well to reduce the space needed for compactions.
>>> We are running on SSD-drives as latency is a key-issue.
>>> 
>>> As a test I have imported one Column Family from 3 production nodes to a 3-node test cluster.
>>> The data on the 3 nodes ranges from 19 to 33 GB (with at least one large SSTable, size-tiered and recently compacted).
>>> 
>>> After loading this data to the 3 test nodes, and running scrub and repair, I took a backup of the data so I have a good test set of data to work on.
>>> Then I changed to leveled compaction, using the cassandra-cli:
>>> 
>>> UPDATE COLUMN FAMILY TestCF1 WITH compaction_strategy=LeveledCompactionStrategy;
>>> I could see the change being written to the logfile on all nodes.
>>> 
>>> Then I don't know for sure if I need to run anything else to make the change happen - or if I just have to wait.
>>> My test-cluster does not receive new data.
>>> 
>>> For this KS & CF, on each of the nodes I have tried one or several of: upgradesstables, scrub, compact, cleanup and repair - each task taking between 40 minutes and 4 hours.
>>> The exception is compact, which returns almost immediately with no visible compactions made.
>>> 
>>> On one node I ended up with over 30000 files with the default 5MB size for leveled compaction; on another node it didn't look like anything had been done and I still have a 19GB SSTable.
>>> 
>>> I then made another change.
>>> UPDATE COLUMN FAMILY TestCF1 WITH compaction_strategy=LeveledCompactionStrategy AND compaction_strategy_options=[{sstable_size_in_mb: 64}];
>>> WARNING: [{}] strategy_options syntax is deprecated, please use {}
>>> So the documentation is probably wrong, and the command should be:
>>> UPDATE COLUMN FAMILY TestCF1 WITH compaction_strategy=LeveledCompactionStrategy AND compaction_strategy_options={sstable_size_in_mb: 64};
>>> 
>>> I think that we will be able to find the data in 3 searches with a 64MB size - and still only use around 700MB while doing compactions - and keep the number of files at ~3000 per CF.
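
The arithmetic behind those three figures can be sketched in Python. This is a rough estimate, not Cassandra's own accounting: it assumes level N holds at most 10**N SSTables (L1 = 10, L2 = 100, and so on), ignores L0 and compression, and assumes that promoting one SSTable rewrites it together with about 10 overlapping SSTables in the next level. The function names and the fanout parameter are made up for illustration.

```python
import math

def leveled_layout(data_mb, sstable_mb, fanout=10):
    """Estimate the SSTable count and the number of levels needed to hold it.

    Assumes level N holds at most fanout**N SSTables and ignores L0.
    """
    sstables = math.ceil(data_mb / sstable_mb)
    levels, capacity = 0, 0
    while capacity < sstables:
        levels += 1
        capacity += fanout ** levels
    return sstables, levels

def compaction_scratch_mb(sstable_mb, fanout=10):
    # One promoted SSTable is merged with roughly `fanout` overlapping
    # SSTables in the next level, so about fanout + 1 files are rewritten.
    return (fanout + 1) * sstable_mb

# Largest node in the test: about 33 GB of data, 64 MB target SSTable size.
sstables, levels = leveled_layout(33 * 1024, 64)
print(sstables, levels, compaction_scratch_mb(64))
```

Under these assumptions 33 GB at 64 MB gives 528 SSTables in 3 levels and about 704 MB rewritten per compaction. With the five component files per compressed SSTable seen in the listings, that is 2640 files, in line with the ~3000 estimate.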
>>> 
>>> A few days later it looks like I still have a mix of the original huge SSTables and 5MB ones - and some nodes have 64MB files as well.
>>> Do I need to do something special to clean this up?
>>> I have tried another scrub/upgradesstables/cleanup - but nothing seems to make any difference.
>>> 
>>> Finally I have also tried to enable compression:
>>> UPDATE COLUMN FAMILY TestCF1 WITH compression_options=[{sstable_compression:SnappyCompressor, chunk_length_kb:64}];
>>> - which results in the same [{}] warning.
>>> 
>>> As you can see below, this created CompressionInfo.db files on some nodes - but not on all.
>>> 
>>> Is there a way I can force Tiered sstables to be converted into Leveled ones - and then to compression as well?
>>> Why are the original files (size-tiered SSTables) still present on testnode1 - when are they supposed to be deleted?
>>> 
>>> Can I change back and forth between compression (on/off - or chunk sizes) - and between Leveled and Size Tiered compaction?
>>> Is there a way to see if the node is done - or waiting for something?
>>> When is it safe to apply another setting - does it have to complete one reorg before moving on to the next?
>>> 
>>> Any input or experience of your own is warmly welcome.
>>> 
>>> Best regards, Johan
>>> 
>>> 
>>> Some lines of example directory listings below:
>>> 
>>> Some files for testnode3 (looks like it still has the original Size Tiered files around, and a mixture of compressed 64MB files and 5MB files?):
>>> total 19G
>>> drwxr-xr-x 3 cass cass 4.0K Mar 13 17:11 snapshots
>>> -rw-r--r-- 1 cass cass 6.0G Mar 13 18:42 TestCF1-hc-6346-Index.db
>>> -rw-r--r-- 1 cass cass 1.3M Mar 13 18:42 TestCF1-hc-6346-Filter.db
>>> -rw-r--r-- 1 cass cass  13G Mar 13 18:42 TestCF1-hc-6346-Data.db
>>> -rw-r--r-- 1 cass cass 2.4M Mar 13 18:42 TestCF1-hc-6346-CompressionInfo.db
>>> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6346-Statistics.db
>>> -rw-r--r-- 1 cass cass 195K Mar 13 18:42 TestCF1-hc-6347-Filter.db
>>> -rw-r--r-- 1 cass cass 4.9M Mar 13 18:42 TestCF1-hc-6347-Index.db
>>> -rw-r--r-- 1 cass cass 9.0M Mar 13 18:42 TestCF1-hc-6347-Data.db
>>> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6347-Statistics.db
>>> -rw-r--r-- 1 cass cass 2.0K Mar 13 18:42 TestCF1-hc-6347-CompressionInfo.db
>>> -rw-r--r-- 1 cass cass  11K Mar 13 18:43 TestCF1-hc-6351-CompressionInfo.db
>>> -rw-r--r-- 1 cass cass  52M Mar 13 18:43 TestCF1-hc-6351-Data.db
>>> -rw-r--r-- 1 cass cass 1.1M Mar 13 18:43 TestCF1-hc-6351-Filter.db
>>> -rw-r--r-- 1 cass cass  28M Mar 13 18:43 TestCF1-hc-6351-Index.db
>>> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6351-Statistics.db
>>> -rw-r--r-- 1 cass cass  401 Mar 13 18:43 TestCF1.json
>>> -rw-r--r-- 1 cass cass  950 Mar 13 18:43 TestCF1-hc-6350-CompressionInfo.db
>>> -rw-r--r-- 1 cass cass 4.3M Mar 13 18:43 TestCF1-hc-6350-Data.db
>>> -rw-r--r-- 1 cass cass  93K Mar 13 18:43 TestCF1-hc-6350-Filter.db
>>> -rw-r--r-- 1 cass cass 2.3M Mar 13 18:43 TestCF1-hc-6350-Index.db
>>> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6350-Statistics.db
>>> -rw-r--r-- 1 cass cass  400 Mar 13 18:43 TestCF1-old.json
>>> 
>>> 
>>> Some DB files ordered by size for testnode1:
>>> No compressed files - but it has the original size-tiered files as well as 5MB leveled-compaction files - but no 64MB ones.
>>> total 83G
>>> -rw-r--r-- 1 cass cass   33G Mar 13 07:14 TestCF1-hc-33504-Data.db
>>> -rw-r--r-- 1 cass cass   11G Mar 13 07:14 TestCF1-hc-33504-Index.db
>>> -rw-r--r-- 1 cass cass  407M Mar 13 07:14 TestCF1-hc-33504-Filter.db
>>> -rw-r--r-- 1 cass cass  5.1M Mar 13 05:27 TestCF1-hc-33338-Data.db
>>> -rw-r--r-- 1 cass cass  5.1M Mar 13 08:54 TestCF1-hc-38997-Data.db
>>> -rw-r--r-- 1 cass cass  5.1M Mar 13 07:15 TestCF1-hc-33513-Data.db
>>> 
>>> 
>>> -- 
>>>   
>>> Johan Elmerfjord | Sr. Systems Administration/Mgr, EMEA | Adobe Systems, Product Technical Operations | p. +45 3231 6008 | x86008 | cell. +46 735 101 444 | Jelmerfj@adobe.com

>>> 
>>> 
