incubator-cassandra-user mailing list archives

From Radim Kolar <...@filez.com>
Subject Re: repair broke TTL based expiration
Date Mon, 19 Mar 2012 21:48:12 GMT
On 19 Mar 2012 20:28, igor@4friends.od.ua wrote:
>
> Hello
>
> Datasize should decrease during minor compactions. Check logs for 
> compactions results.
>
They do, but not by as much as I expect. Look at the sizes and file dates:

-rw-r--r--  1 root  wheel   5.4G Feb 23 17:03 resultcache-hc-27045-Data.db
-rw-r--r--  1 root  wheel   6.4G Feb 23 17:11 resultcache-hc-27047-Data.db
-rw-r--r--  1 root  wheel   5.5G Feb 25 06:40 resultcache-hc-27167-Data.db
-rw-r--r--  1 root  wheel   2.2G Mar  2 05:03 resultcache-hc-27323-Data.db
-rw-r--r--  1 root  wheel   2.0G Mar  5 09:15 resultcache-hc-27542-Data.db
-rw-r--r--  1 root  wheel   2.2G Mar 12 23:24 resultcache-hc-27791-Data.db
-rw-r--r--  1 root  wheel   468M Mar 15 03:27 resultcache-hc-27822-Data.db
-rw-r--r--  1 root  wheel   483M Mar 16 05:23 resultcache-hc-27853-Data.db
-rw-r--r--  1 root  wheel    53M Mar 17 05:33 resultcache-hc-27901-Data.db
-rw-r--r--  1 root  wheel   485M Mar 17 09:37 resultcache-hc-27930-Data.db
-rw-r--r--  1 root  wheel   480M Mar 19 00:45 resultcache-hc-27961-Data.db
-rw-r--r--  1 root  wheel    95M Mar 19 09:35 resultcache-hc-27967-Data.db
-rw-r--r--  1 root  wheel    98M Mar 19 17:04 resultcache-hc-27973-Data.db
-rw-r--r--  1 root  wheel    19M Mar 19 18:23 resultcache-hc-27974-Data.db
-rw-r--r--  1 root  wheel    19M Mar 19 19:50 resultcache-hc-27975-Data.db
-rw-r--r--  1 root  wheel    19M Mar 19 21:17 resultcache-hc-27976-Data.db
-rw-r--r--  1 root  wheel    19M Mar 19 22:05 resultcache-hc-27977-Data.db

I insert everything with a 7-day TTL plus 10 days of tombstone expiration. This 
means that, in the ideal case, there should be nothing older than Mar 2.
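The arithmetic behind that Mar 2 cutoff can be checked in a few lines (the dates and durations are from this thread; the variable names are mine, and I am assuming the 10 days act like gc_grace):

```python
from datetime import date, timedelta

ttl = timedelta(days=7)        # everything is inserted with a 7-day TTL
gc_grace = timedelta(days=10)  # tombstones become purgeable 10 days after expiry
today = date(2012, 3, 19)      # date of this message

# Any data written before this date has both expired and had its
# tombstone pass gc_grace, so it should be gone after compaction.
cutoff = today - (ttl + gc_grace)
print(cutoff)  # 2012-03-02
```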

These three ~5 GB files are waiting to be compacted. Because they contain only 
tombstones, Cassandra should be able to apply an optimization: mark the sstable 
as tombstone-only, remember the timestamp of its latest tombstone, and delete 
the entire sstable without needing to merge it first.
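A minimal sketch of the check I have in mind (my own Python sketch, not Cassandra's actual compaction code; the function name and parameters are hypothetical):

```python
import time

def sstable_droppable(max_tombstone_ts, gc_grace_seconds, now=None):
    """If an sstable holds nothing but tombstones, the whole file can be
    deleted once its newest tombstone is older than gc_grace -- no merge
    with other sstables is needed, since every entry is already purgeable."""
    now = time.time() if now is None else now
    return max_tombstone_ts + gc_grace_seconds < now

# Example: newest tombstone at day 100, gc_grace of 10 days, checked at day 120
DAY = 86400
print(sstable_droppable(100 * DAY, 10 * DAY, now=120 * DAY))  # True
```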

1. The first question is why a tombstone is created after row expiration at 
all, since the row will expire on every cluster node at the same time and does 
not need to be deleted explicitly.
2. It is a super column family. When I dump the oldest sstable, I wonder why 
it looks like this:

{
  "7777772c61727469636c65736f61702e636f6d": {},
  "7175616b652d34": {
    "1": {
      "deletedAt": -9223372036854775808,
      "subColumns": [
        ["crc32", "4f34455c", 1328220892597002, "d"],
        ["id", "4f34455c", 1328220892597000, "d"],
        ["name", "4f34455c", 1328220892597001, "d"],
        ["size", "4f34455c", 1328220892597003, "d"]
      ]
    },
    "2": {
      "deletedAt": -9223372036854775808,
      "subColumns": [
        ["crc32", "4f34455c", 1328220892597007, "d"],
        ["id", "4f34455c", 1328220892597005, "d"],
        ["name", "4f34455c", 1328220892597006, "d"],
        ["size", "4f34455c", 1328220892597008, "d"]
      ]
    },
    "3": {
      "deletedAt": -9223372036854775808,
      "subColumns":
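For what it's worth, the row keys in the dump are hex-encoded strings, and the column timestamps are microseconds since the epoch, so the age of this data is easy to check (plain Python, nothing Cassandra-specific):

```python
from datetime import datetime, timezone

# Row keys from the dump above, hex-decoded
for key in ("7777772c61727469636c65736f61702e636f6d", "7175616b652d34"):
    print(bytes.fromhex(key).decode("ascii"))
# www,articlesoap.com
# quake-4

# Column timestamps are microseconds since the epoch
ts = 1328220892597002 / 1_000_000
print(datetime.fromtimestamp(ts, tz=timezone.utc).date())  # 2012-02-02
```

So these tombstones cover data written on Feb 2, well before the Mar 2 cutoff.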

* All subcolumns are deleted, so why keep their names in the table at all? 
Isn't marking the column as deleted, i.e. "1": {"deletedAt": 
-9223372036854775808}, enough?
* Another question: why wasn't the entire row tombstoned, given that all of 
its members had expired?
