cassandra-user mailing list archives

From: Johan Elmerfjord
Subject: 1.0.8 with Leveled compaction - Possible issues
Date: Thu, 15 Mar 2012 22:05:16 GMT
Hi, I'm testing the community-version of Cassandra 1.0.8.
We are currently on 0.8.7 in our production-setup.

We have 3 Column Families that each take between 20 and 35 GB on disk
per node (8*2 nodes total).
We would like to change to Leveled Compaction - and even try compression
as well to reduce the space needed for compactions.
We are running on SSD-drives as latency is a key-issue.

As a test I have imported one Column Family from 3 production-nodes to a
3-node test-cluster.
The data on the 3 nodes ranges from 19-33GB. (with at least one large
SSTable (Tiered size - recently compacted)).

After loading this data to the 3 test-nodes, and running scrub and
repair, I took a backup of the data so I have a good test-set of data to
work on.
Then I changed to leveled compaction, using the cassandra-cli:

I could see the change being written to the logfile on all nodes.

Then I don't know for sure if I need to run anything else to make
the change happen - or if I just have to wait.
My test-cluster does not receive new data.
For this KS & CF, on each of the nodes I have tried one or several
of: upgradesstables, scrub, compact, cleanup and repair - each task
taking between 40 minutes and 4 hours,
with the exception of compact, which returns almost immediately with no
visible compactions made.

On one node I ended up with over 30000 files with the default 5MB size
for leveled compaction; on another node it doesn't look like anything
has been done, and I still have a 19GB SSTable.

I then made another change.
compaction_strategy=LeveledCompactionStrategy AND
compaction_strategy_options=[{sstable_size_in_mb: 64}];
WARNING: [{}] strategy_options syntax is deprecated, please use {}
Which is probably wrong in the documentation - and should be:
compaction_strategy=LeveledCompactionStrategy AND
compaction_strategy_options={sstable_size_in_mb: 64};

I think that we will be able to find the data in 3 searches with a 64MB
size - and still only use around 700MB while doing compactions - and
keep the number of files ~3000 per CF.
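
A rough sketch of the arithmetic behind estimates like these. It assumes
each level holds up to 10x the SSTables of the previous one (L1 = 10,
L2 = 100, ...), that a single compaction rewrites one SSTable plus about
10 overlapping ones, and that each SSTable consists of 5 component files
(Data, Index, Filter, Statistics, CompressionInfo). The function name and
all defaults are hypothetical, not taken from Cassandra itself.

```python
import math


def leveled_estimate(data_mb, sstable_mb, fanout=10, files_per_sstable=5):
    """Rough sizing for leveled compaction under the assumptions above."""
    sstables = math.ceil(data_mb / sstable_mb)

    # Assumption: level n holds up to fanout**n SSTables (L1=10, L2=100, ...).
    levels, capacity = 0, 0
    while capacity < sstables:
        levels += 1
        capacity += fanout ** levels

    # Assumption: one compaction rewrites one SSTable from a level
    # plus roughly `fanout` overlapping SSTables from the next level.
    compaction_mb = (fanout + 1) * sstable_mb

    return {
        "sstables": sstables,
        "levels": levels,
        "compaction_mb": compaction_mb,
        "files": sstables * files_per_sstable,
    }


if __name__ == "__main__":
    # 33 GB is the largest per-node size mentioned for the test data.
    for size_mb in (5, 64):
        print(size_mb, leveled_estimate(33 * 1024, size_mb))
```

With 33 GB of data this gives 3 levels, 704 MB of compaction working
space and 2640 files for 64MB SSTables, against 4 levels and more than
30000 files for the 5MB default.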

A few days later it looks like I still have a mix of the original huge
SSTables and 5MB ones - and some nodes have 64MB files as well.
Do I need to do something special to clean this up?
I have tried another scrub / upgradesstables / cleanup - but nothing
seems to change anything.

Finally I have also tried to enable compression:
- which results in the same [{}] - warning.

As you can see below - this created CompressionInfo.db - files on some
nodes - but not on all.

Is there a way I can force Size-Tiered SSTables to be converted into
Leveled ones - and then to compression as well?
Why are the original files (Size-Tiered SSTables) still present on
testnode1 - when are they supposed to be deleted?

Can I change back and forth between compression (on/off - or chunksizes)
- and between Leveled vs Size Tiered compaction?
Is there a way to see if the node is done - or waiting for something?
When is it safe to apply another setting - does it have to complete one
reorg before moving on to the next?

Any input or experiences of your own are very welcome.

Best regards, Johan

Some lines of example directory-listings below:

Some files for testnode 3 (looks like it still has the original Size
Tiered files around, and a mixture of compressed 64MB files and 5MB
files):

total 19G
drwxr-xr-x 3 cass cass 4.0K Mar 13 17:11 snapshots
-rw-r--r-- 1 cass cass 6.0G Mar 13 18:42 TestCF1-hc-6346-Index.db
-rw-r--r-- 1 cass cass 1.3M Mar 13 18:42 TestCF1-hc-6346-Filter.db
-rw-r--r-- 1 cass cass  13G Mar 13 18:42 TestCF1-hc-6346-Data.db
-rw-r--r-- 1 cass cass 2.4M Mar 13 18:42 TestCF1-hc-6346-CompressionInfo.db
-rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6346-Statistics.db
-rw-r--r-- 1 cass cass 195K Mar 13 18:42 TestCF1-hc-6347-Filter.db
-rw-r--r-- 1 cass cass 4.9M Mar 13 18:42 TestCF1-hc-6347-Index.db
-rw-r--r-- 1 cass cass 9.0M Mar 13 18:42 TestCF1-hc-6347-Data.db
-rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6347-Statistics.db
-rw-r--r-- 1 cass cass 2.0K Mar 13 18:42 TestCF1-hc-6347-CompressionInfo.db
-rw-r--r-- 1 cass cass  11K Mar 13 18:43 TestCF1-hc-6351-CompressionInfo.db
-rw-r--r-- 1 cass cass  52M Mar 13 18:43 TestCF1-hc-6351-Data.db
-rw-r--r-- 1 cass cass 1.1M Mar 13 18:43 TestCF1-hc-6351-Filter.db
-rw-r--r-- 1 cass cass  28M Mar 13 18:43 TestCF1-hc-6351-Index.db
-rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6351-Statistics.db
-rw-r--r-- 1 cass cass  401 Mar 13 18:43 TestCF1.json
-rw-r--r-- 1 cass cass  950 Mar 13 18:43 TestCF1-hc-6350-CompressionInfo.db
-rw-r--r-- 1 cass cass 4.3M Mar 13 18:43 TestCF1-hc-6350-Data.db
-rw-r--r-- 1 cass cass  93K Mar 13 18:43 TestCF1-hc-6350-Filter.db
-rw-r--r-- 1 cass cass 2.3M Mar 13 18:43 TestCF1-hc-6350-Index.db
-rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6350-Statistics.db
-rw-r--r-- 1 cass cass  400 Mar 13 18:43 TestCF1-old.json

Some DB-files ordered by size for testnode1:
No compressed files - but it has the original Size-Tiered files as well
as 5MB Leveled compaction files - but no 64MB ones.

total 83G
-rw-r--r-- 1 cass cass   33G Mar 13 07:14 TestCF1-hc-33504-Data.db
-rw-r--r-- 1 cass cass   11G Mar 13 07:14 TestCF1-hc-33504-Index.db
-rw-r--r-- 1 cass cass  407M Mar 13 07:14 TestCF1-hc-33504-Filter.db
-rw-r--r-- 1 cass cass  5.1M Mar 13 05:27 TestCF1-hc-33338-Data.db
-rw-r--r-- 1 cass cass  5.1M Mar 13 08:54 TestCF1-hc-38997-Data.db
-rw-r--r-- 1 cass cass  5.1M Mar 13 07:15 TestCF1-hc-33513-Data.db
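
A mix like the one in these listings can be tallied per node with a
small script. This is only a sketch: the size thresholds (16 MB and
128 MB) are arbitrary assumptions chosen to separate roughly 5MB files,
roughly 64MB files and large size-tiered leftovers, and the function
names are hypothetical. It only looks at file sizes on disk and says
nothing about what Cassandra itself considers pending.

```python
import os
from collections import Counter

MB = 1024 * 1024


def bucket(size_bytes, small_mb=16, large_mb=128):
    """Classify a Data.db file by size (thresholds are assumptions)."""
    if size_bytes < small_mb * MB:
        return "small"
    if size_bytes < large_mb * MB:
        return "medium"
    return "large"


def summarize(entries):
    """Count Data.db files per size bucket.

    entries: iterable of (filename, size_in_bytes) pairs.
    """
    counts = Counter()
    for name, size in entries:
        if name.endswith("-Data.db"):
            counts[bucket(size)] += 1
    return counts


def scan(directory):
    """Summarize the Data.db files found directly in `directory`."""
    with os.scandir(directory) as it:
        return summarize(
            (e.name, e.stat().st_size) for e in it if e.is_file()
        )


if __name__ == "__main__":
    import sys

    # Usage: python sstable_mix.py <column family data directory>
    print(dict(scan(sys.argv[1])))
```

Running it on each node before and after a change gives a quick view of
whether the large files are shrinking in number.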

Johan Elmerfjord | Sr. Systems Administration/Mgr, EMEA | Adobe Systems,
Product Technical Operations | p. +45 3231 6008 | x86008 | cell. +46 735
101 444 | 
