incubator-cassandra-user mailing list archives

From Thomas van Neerijnen <...@bossastudios.com>
Subject Re: 1.0.8 with Leveled compaction - Possible issues
Date Thu, 15 Mar 2012 22:39:39 GMT
Heya

I'd suggest staying away from Leveled Compaction until 1.0.9.
For the why see this great explanation I got from Maki Watanabe on the
list:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201203.mbox/%3CCALqbeQbQ=d-hORVhA-LHOo_a5j46fQrsZMm+OQgfkgR=4RRQJQ@mail.gmail.com%3E
Keep an eye on that thread: I'm busy testing one of his suggestions and
will post back with the results soon.

My understanding is that after a change in compaction or compression
settings, the existing sstables keep the old schema settings until you run
upgradesstables on all the nodes; only newly written sstables get the new
format. Obviously this compounds the issue I mentioned above, though.
Be warned, upgradesstables can take a long time, so maybe keep an eye on
the number of files at around 5MB versus those over 5MB to get an idea of
progress. Maybe someone else knows a better way?
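
One rough way to do that counting is sketched below in Python. It is only an illustration: the "-Data.db" suffix is taken from the directory listings quoted further down, while the function name, the threshold just above 5MB and the idea of pointing it at a column family's data directory are my own assumptions, not a tested tool.

```python
import os

# Assumption: leveled sstables show up at about 5.1MB in the listings,
# so anything up to 6MB is counted as "leveled-sized".
DEFAULT_THRESHOLD_BYTES = 6 * 1024 * 1024


def count_data_files(directory, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return (small, large): the number of *-Data.db files at or below
    the threshold, and the number above it, in one directory."""
    small = 0
    large = 0
    for name in os.listdir(directory):
        if not name.endswith("-Data.db"):
            continue  # skip Index, Filter, Statistics and other components
        path = os.path.join(directory, name)
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file was removed by a compaction while we were looking
        if size <= threshold_bytes:
            small += 1
        else:
            large += 1
    return small, large
```

Calling count_data_files() on the column family's data directory every few minutes and watching the second number shrink should give a crude idea of how far along a node is.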

You can change back and forth between compression and compaction options
quite safely, but again you need to run upgradesstables to apply the change
to the existing sstables.

I've applied compression and leveled compaction to the same CF at the same
time without issue, so I guess it's OK :)

On Thu, Mar 15, 2012 at 10:05 PM, Johan Elmerfjord <jelmerfj@adobe.com> wrote:

> Hi, I'm testing the community-version of Cassandra 1.0.8.
> We are currently on 0.8.7 in our production-setup.
>
> We have 3 Column Families that each takes between 20 and 35 GB on disk per
> node. (8*2 nodes total)
> We would like to change to Leveled Compaction - and even try compression
> as well to reduce the space needed for compactions.
> We are running on SSD-drives as latency is a key-issue.
>
> As test I have imported one Column Family from 3 production-nodes to a 3
> node test-cluster.
> The data on the 3 nodes ranges from 19-33GB. (with at least one large
> SSTable (Tiered size - recently compacted)).
>
> After loading this data to the 3 test-nodes, and running scrub and repair,
> I took a backup of the data so I have good test-set of data to work on.
> Then I changed to leveled compaction, using the cassandra-cli:
>
> UPDATE COLUMN FAMILY TestCF1 WITH
> compaction_strategy=LeveledCompactionStrategy;
> I could see the change being written to the logfile on all nodes.
>
> Then I don't know for sure if I need to run anything else to make the
> change happen - or if I just need to wait.
> My test-cluster does not receive new data.
>
> For this KS & CF, on each of the nodes I have tried one or several
> of: upgradesstables, scrub, compact, cleanup and repair - each task taking
> between 40 minutes and 4 hours,
> with the exception of compact, which returns almost immediately with no
> visible compactions made.
>
> On one node I ended up with over 30000 files with the default 5MB size
> for leveled compaction; on another node it didn't look like anything had
> been done and I still have a 19GB SSTable.
>
> I then made another change.
> UPDATE COLUMN FAMILY TestCF1 WITH
> compaction_strategy=LeveledCompactionStrategy AND
> compaction_strategy_options=[{sstable_size_in_mb: 64}];
> WARNING: [{}] strategy_options syntax is deprecated, please use {}
> Which is probably wrong in the documentation - and should be:
> UPDATE COLUMN FAMILY TestCF1 WITH
> compaction_strategy=LeveledCompactionStrategy AND
> compaction_strategy_options={sstable_size_in_mb: 64};
>
> I think that we will be able to find the data in 3 searches with a 64MB
> size - and still only use around 700MB while doing compactions - and keep
> the number of files ~3000 per CF.
>
> A few days later it looks like I still have a mix of the original huge
> SSTables and 5MB ones - and some nodes have 64MB files as well.
> Do I need to do something special to clean this up?
> I have tried another scrub/upgradesstables/cleanup - but nothing seems to
> make any difference.
>
> Finally I have also tried to enable compression:
> UPDATE COLUMN FAMILY TestCF1 WITH
> compression_options=[{sstable_compression:SnappyCompressor,
> chunk_length_kb:64}];
> - which results in the same [{}] - warning.
>
> As you can see below, this created CompressionInfo.db files on some
> nodes - but not on all.
>
> *Is there a way I can force Tiered sstables to be converted into Leveled
> ones - and then to compression as well?*
> *Why are the original files (Size Tiered SSTables) still present on
> testnode1 - when are they supposed to be deleted?*
>
> *Can I change back and forth between compression (on/off - or chunksizes)
> - and between Leveled vs Size Tiered compaction?*
> *Is there a way to see if the node is done - or waiting for something?*
> *When is it safe to apply another setting - does it have to complete one
> reorg before moving on to the next?*
>
> *Any input or own experiences are warmly welcome.*
>
> Best regards, Johan
>
>
> Some lines of example directory listings below:
>
> Some files for testnode3 (looks like it still has the original Size
> Tiered files around, and a mixture of compressed 64MB files and 5MB
> files?):
>
> total 19G
> drwxr-xr-x 3 cass cass 4.0K Mar 13 17:11 snapshots
> -rw-r--r-- 1 cass cass 6.0G Mar 13 18:42 TestCF1-hc-6346-Index.db
> -rw-r--r-- 1 cass cass 1.3M Mar 13 18:42 TestCF1-hc-6346-Filter.db
> -rw-r--r-- 1 cass cass  13G Mar 13 18:42 TestCF1-hc-6346-Data.db
> -rw-r--r-- 1 cass cass 2.4M Mar 13 18:42 TestCF1-hc-6346-CompressionInfo.db
> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6346-Statistics.db
> -rw-r--r-- 1 cass cass 195K Mar 13 18:42 TestCF1-hc-6347-Filter.db
> -rw-r--r-- 1 cass cass 4.9M Mar 13 18:42 TestCF1-hc-6347-Index.db
> -rw-r--r-- 1 cass cass 9.0M Mar 13 18:42 TestCF1-hc-6347-Data.db
> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:42 TestCF1-hc-6347-Statistics.db
> -rw-r--r-- 1 cass cass 2.0K Mar 13 18:42 TestCF1-hc-6347-CompressionInfo.db
> -rw-r--r-- 1 cass cass  11K Mar 13 18:43 TestCF1-hc-6351-CompressionInfo.db
> -rw-r--r-- 1 cass cass  52M Mar 13 18:43 TestCF1-hc-6351-Data.db
> -rw-r--r-- 1 cass cass 1.1M Mar 13 18:43 TestCF1-hc-6351-Filter.db
> -rw-r--r-- 1 cass cass  28M Mar 13 18:43 TestCF1-hc-6351-Index.db
> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6351-Statistics.db
> -rw-r--r-- 1 cass cass  401 Mar 13 18:43 TestCF1.json
> -rw-r--r-- 1 cass cass  950 Mar 13 18:43 TestCF1-hc-6350-CompressionInfo.db
> -rw-r--r-- 1 cass cass 4.3M Mar 13 18:43 TestCF1-hc-6350-Data.db
> -rw-r--r-- 1 cass cass  93K Mar 13 18:43 TestCF1-hc-6350-Filter.db
> -rw-r--r-- 1 cass cass 2.3M Mar 13 18:43 TestCF1-hc-6350-Index.db
> -rw-r--r-- 1 cass cass 4.3K Mar 13 18:43 TestCF1-hc-6350-Statistics.db
> -rw-r--r-- 1 cass cass  400 Mar 13 18:43 TestCF1-old.json
>
>
>
> Some DB files ordered by size for testnode1:
> No compressed files - but it has the original Size Tiered files as well as
> 5MB Leveled compaction files - but no 64MB ones.
>
> total 83G
> -rw-r--r-- 1 cass cass   33G Mar 13 07:14 TestCF1-hc-33504-Data.db
> -rw-r--r-- 1 cass cass   11G Mar 13 07:14 TestCF1-hc-33504-Index.db
> -rw-r--r-- 1 cass cass  407M Mar 13 07:14 TestCF1-hc-33504-Filter.db
> -rw-r--r-- 1 cass cass  5.1M Mar 13 05:27 TestCF1-hc-33338-Data.db
> -rw-r--r-- 1 cass cass  5.1M Mar 13 08:54 TestCF1-hc-38997-Data.db
> -rw-r--r-- 1 cass cass  5.1M Mar 13 07:15 TestCF1-hc-33513-Data.db
>
>
>
>   --
>
> *Johan Elmerfjord* | Sr. Systems Administration/Mgr, EMEA | Adobe
> Systems, Product Technical Operations | p. +45 3231 6008 | x86008 | cell. +46
> 735 101 444 | Jelmerfj@adobe.com
>
>
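
As a rough sanity check on the file counts and the "3 searches" estimate in the quoted message, here is some back-of-the-envelope arithmetic in Python. Everything in it is an assumption for illustration: the 33GB figure is the largest Data.db in the testnode1 listing, the 4 or 5 component files per sstable are read off the listings (Data, Index, Filter, Statistics, plus CompressionInfo when compressed), and the rule that level N holds about 10^N sstables (ignoring L0) is my simplified reading of how leveled compaction sizes its levels. The helper name is made up.

```python
import math


def leveled_estimate(data_mb, sstable_mb, components_per_sstable, fanout=10):
    """Return (sstables, files, levels) for a fully leveled column family.

    Assumes level N holds fanout**N sstables and ignores L0; both are
    simplifications, not a statement of what Cassandra does internally.
    """
    sstables = math.ceil(data_mb / sstable_mb)
    files = sstables * components_per_sstable
    levels = 0
    capacity = 0
    while capacity < sstables:
        levels += 1
        capacity += fanout ** levels
    return sstables, files, levels


data_mb = 33 * 1024  # about 33GB of data on the largest node

# Default 5MB sstables, uncompressed: 4 component files per sstable.
print(leveled_estimate(data_mb, 5, 4))

# 64MB sstables with compression: 5 component files per sstable.
print(leveled_estimate(data_mb, 64, 5))
```

Under those assumptions the 5MB setting comes out in the tens of thousands of files and four levels, while 64MB lands in the low thousands and three levels, which is at least in the same ballpark as the numbers reported above.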
