cassandra-user mailing list archives

From Bryan Talbot <btal...@aeriagames.com>
Subject Re: LCS not removing rows with all TTL expired columns
Date Fri, 18 Jan 2013 02:50:39 GMT
Bleh, I rushed out the email before some meetings and I messed something
up.  Working on reproducing now with better notes this time.

-Bryan



On Thu, Jan 17, 2013 at 4:45 PM, Derek Williams <derek@fyrie.net> wrote:

> When you ran this test, is that the exact schema you used? I'm not seeing
> where you are setting gc_grace to 0 (although I could just be blind, it
> happens).
>
>
> On Thu, Jan 17, 2013 at 5:01 PM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>
>> I'm able to reproduce this behavior on my laptop using 1.1.5, 1.1.7,
>> 1.1.8, a trivial schema, and a simple script that just inserts rows.  If
>> the TTL is small enough that all LCS data fits in generation 0, then the
>> rows seem to be removed when their TTLs expire, as desired.  However, if
>> the insertion rate is high enough or the TTL long enough, then the data
>> keeps accumulating for far longer than expected.
>>
>> Using a 120-second TTL and a single-threaded php insertion script, my MBP
>> with SSD retained almost all of the data.  120 seconds of inserts should
>> accumulate only 5-10 MB of data, so I would expect the TTL-expired rows
>> to be removed eventually and the cassandra load to level off at some
>> reasonable value near 10 MB.  After running for 2 hours with a cassandra
>> load of ~550 MB, I stopped the test.
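A quick back-of-the-envelope check on that expectation (a sketch only; the insert rate and row size below are assumed figures, not measurements from this test):

```python
# Steady-state size for a TTL'd, write-only workload: once expired rows
# are actually purged by compaction, data on disk should level off near
# insert_rate * bytes_per_row * TTL.

def steady_state_bytes(rows_per_sec, bytes_per_row, ttl_sec):
    """Expected live data size once expiration keeps pace with inserts."""
    return rows_per_sec * bytes_per_row * ttl_sec

# Assumed figures: ~500 single-threaded inserts/sec of ~100-byte rows
# with the 120 s TTL used in the test above.
mb = steady_state_bytes(500, 100, 120) / 1e6
print(mb, "MB")  # 6.0 MB -- consistent with the 5-10 MB expectation
```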
>>
>> The schema is
>>
>> create keyspace test
>>   with placement_strategy = 'SimpleStrategy'
>>   and strategy_options = {replication_factor : 1}
>>   and durable_writes = true;
>>
>> use test;
>>
>> create column family test
>>   with column_type = 'Standard'
>>   and comparator = 'UTF8Type'
>>   and default_validation_class = 'UTF8Type'
>>   and key_validation_class = 'TimeUUIDType'
>>   and compaction_strategy =
>> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
>>   and caching = 'NONE'
>>   and bloom_filter_fp_chance = 1.0
>>   and column_metadata = [
>>     {column_name : 'a',
>>     validation_class : LongType}];
>>
>>
>> and the insert script is
>>
>> <?php
>>
>> require_once('phpcassa/1.0.a.5/autoload.php');
>>
>> use phpcassa\Connection\ConnectionPool;
>> use phpcassa\ColumnFamily;
>> use phpcassa\SystemManager;
>> use phpcassa\UUID;
>>
>> // Connect to test keyspace and column family
>> $sys = new SystemManager('127.0.0.1');
>>
>> // Start a connection pool, create our ColumnFamily instance
>> $pool = new ConnectionPool('test', array('127.0.0.1'));
>> $testCf = new ColumnFamily($pool, 'test');
>>
>> // Insert records
>> while( 1 ) {
>>   $testCf->insert(UUID::uuid1(), array("a" => 1), null, 120);
>> }
>>
>> // Close our connections
>> $pool->close();
>> $sys->close();
>>
>> ?>
>>
>>
>> -Bryan
>>
>>
>>
>>
>> On Thu, Jan 17, 2013 at 10:11 AM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>>
>>> We are using LCS, and the particular row I've referenced has been
>>> involved in several compactions since all of its columns TTL expired.  The
>>> most recent compaction was again this morning and the row is still there --
>>> TTL expired for several days now, with gc_grace=0, and several compactions
>>> later ...
>>>
>>>
>>> $> ./bin/nodetool -h localhost getsstables metrics request_summary
>>> 459fb460-5ace-11e2-9b92-11d67b6163b4
>>>
>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-448955-Data.db
>>>
>>> $> ls -alF
>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-448955-Data.db
>>> -rw-rw-r-- 1 sandra sandra 5246509 Jan 17 06:54
>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-448955-Data.db
>>>
>>>
>>> $> ./bin/sstable2json
>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-448955-Data.db
>>> -k $(echo -n 459fb460-5ace-11e2-9b92-11d67b6163b4 | hexdump  -e '36/1 "%x"')
>>>  {
>>> "34353966623436302d356163652d313165322d396239322d313164363762363136336234":
>>> [["app_name","50f21d3d",1357785277207001,"d"],
>>> ["client_ip","50f21d3d",1357785277207001,"d"],
>>> ["client_req_id","50f21d3d",1357785277207001,"d"],
>>> ["mysql_call_cnt","50f21d3d",1357785277207001,"d"],
>>> ["mysql_duration_us","50f21d3d",1357785277207001,"d"],
>>> ["mysql_failure_call_cnt","50f21d3d",1357785277207001,"d"],
>>> ["mysql_success_call_cnt","50f21d3d",1357785277207001,"d"],
>>> ["req_duration_us","50f21d3d",1357785277207001,"d"],
>>> ["req_finish_time_us","50f21d3d",1357785277207001,"d"],
>>> ["req_method","50f21d3d",1357785277207001,"d"],
>>> ["req_service","50f21d3d",1357785277207001,"d"],
>>> ["req_start_time_us","50f21d3d",1357785277207001,"d"],
>>> ["success","50f21d3d",1357785277207001,"d"]]
>>> }
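As an aside, the -k argument passed to sstable2json above is just the hex encoding of the ASCII key string, which is also why the row key in the JSON output looks so long. A small Python equivalent of the `echo -n ... | hexdump` pipeline:

```python
import binascii

# The row key stored here is the ASCII UUID string, so its hex form is
# simply the hex of each character's byte -- exactly what
# `echo -n <key> | hexdump -e '36/1 "%x"'` produces.
key = "459fb460-5ace-11e2-9b92-11d67b6163b4"
hex_key = binascii.hexlify(key.encode("ascii")).decode()
print(hex_key)
# 34353966623436302d356163652d313165322d396239322d313164363762363136336234
```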
>>>
>>>
>>> My experience with TTL columns so far has been pretty similar to
>>> Viktor's, in that the only way to keep the row count under control is to
>>> force major compactions.  In real-world use, STCS and LCS both leave
>>> TTL-expired rows around forever as far as I can tell.  When testing with
>>> minimal data, removal of TTL-expired rows seems to work as expected, so
>>> there seems to be some divergence between real-life workloads and test
>>> samples.
>>>
>>> -Bryan
>>>
>>>
>>>
>>>
>>> On Thu, Jan 17, 2013 at 1:47 AM, Viktor Jevdokimov <
>>> Viktor.Jevdokimov@adform.com> wrote:
>>>
>>>>  @Bryan,
>>>>
>>>> To keep the data size as low as possible with TTL columns, we still use
>>>> STCS and nightly major compactions.
>>>>
>>>> Experience with LCS was not successful in our case: the data size
>>>> stayed too high, along with the number of compactions.
>>>>
>>>> IMO, before 1.2, LCS was only good for CFs without TTLs or a high
>>>> delete rate. I have not tested 1.2 LCS behavior; we're still on 1.0.x.
>>>>
>>>>    Best regards / Pagarbiai
>>>> *Viktor Jevdokimov*
>>>> Senior Developer
>>>>
>>>> Email: Viktor.Jevdokimov@adform.com
>>>> Phone: +370 5 212 3063, Fax +370 5 261 0453
>>>> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>>>> Follow us on Twitter: @adforminsider<http://twitter.com/#!/adforminsider>
>>>> Take a ride with Adform's Rich Media Suite<http://vimeo.com/adform/richmedia>
>>>>
>>>> Disclaimer: The information contained in this message and attachments
>>>> is intended solely for the attention and use of the named addressee and may
>>>> be confidential. If you are not the intended recipient, you are reminded
>>>> that the information remains the property of the sender. You must not use,
>>>> disclose, distribute, copy, print or rely on this e-mail. If you have
>>>> received this message in error, please contact the sender immediately and
>>>> irrevocably delete this message and any copies.
>>>>
>>>>   *From:* aaron morton [mailto:aaron@thelastpickle.com]
>>>> *Sent:* Thursday, January 17, 2013 06:24
>>>> *To:* user@cassandra.apache.org
>>>> *Subject:* Re: LCS not removing rows with all TTL expired columns
>>>>
>>>> Minor compaction (with Size Tiered) will only purge tombstones if all
>>>> fragments of a row are contained in the SSTables being compacted. So if
>>>> you have a long-lived row that is present in many size tiers, the
>>>> columns will not be purged.
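The purge rule described here (gc_grace must have elapsed, and every fragment of the row must be inside the set of SSTables being compacted) can be sketched as a toy model -- illustrative only, not Cassandra's actual code, and all names below are made up:

```python
import time

def can_purge_tombstone(local_deletion_time, gc_grace,
                        row_sstables, compacting_sstables, now=None):
    """A tombstone may be dropped during compaction only if gc_grace has
    elapsed AND every SSTable holding a fragment of this row is part of
    the compaction; otherwise dropping it could resurrect older on-disk
    versions of the column."""
    now = now if now is not None else int(time.time())
    grace_elapsed = local_deletion_time + gc_grace <= now
    all_fragments_included = row_sstables <= compacting_sstables
    return grace_elapsed and all_fragments_included

# A long-lived row spread across size tiers: a fragment in sstable 3 is
# not in this compaction, so the tombstone survives even with gc_grace=0.
print(can_purge_tombstone(1357785277, 0, {1, 2, 3}, {1, 2}, now=1358044477))  # False
print(can_purge_tombstone(1357785277, 0, {1, 2}, {1, 2}, now=1358044477))     # True
```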
>>>>
>>>>   (thus compacted) 3 days after all columns for that row had expired
>>>>
>>>> Tombstones have to get on disk, even if you set gc_grace_seconds to 0.
>>>> If they do not, they never get a chance to delete previous versions of
>>>> the column which already exist on disk. So when the compaction ran, your
>>>> ExpiringColumn was turned into a DeletedColumn and placed on disk.
>>>>
>>>> I would expect the next round of compaction to remove these columns.
>>>>
>>>> There is a new feature in 1.2 that may help you here. It will do a
>>>> special compaction of individual sstables when they have a certain
>>>> proportion of dead columns:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-3442
>>>>
>>>> Also interested to know if LCS helps.
>>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> New Zealand
>>>>
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 17/01/2013, at 2:55 PM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>>>>
>>>>
>>>> According to the timestamps (see original post) the SSTable was written
>>>> (thus compacted) 3 days after all columns for that row had expired and
>>>> 6 days after the row was created; yet all columns are still showing up
>>>> in the SSTable.  Note that a "get" for that key correctly returns no
>>>> rows, so reads work as expected, but the data is lugged around far
>>>> longer than it should be -- maybe forever.
>>>>
>>>> -Bryan
>>>>
>>>> On Wed, Jan 16, 2013 at 5:44 PM, Andrey Ilinykh <ailinykh@gmail.com>
>>>> wrote:
>>>>
>>>> To get a column removed you have to meet two requirements:
>>>> 1. the column should be expired
>>>> 2. after that, the CF gets compacted
>>>>
>>>> I guess your expired columns are propagated to a higher tier, which
>>>> gets compacted rarely. So you have to wait until the higher tier gets
>>>> compacted.
>>>>
>>>> Andrey
>>>>
>>>>
>>>> On Wed, Jan 16, 2013 at 11:39 AM, Bryan Talbot <btalbot@aeriagames.com>
>>>> wrote:
>>>>
>>>> On cassandra 1.1.5 with a write-heavy workload, we're having problems
>>>> getting rows to be compacted away (removed) even though all of their
>>>> columns have expired TTLs.  We've tried size-tiered and now leveled
>>>> compaction and are seeing the same symptom: the data stays around
>>>> essentially forever.
>>>>
>>>> Currently we write all columns with a TTL of 72 hours (259200 seconds)
>>>> and expect to add 10 GB of data to this CF per day per node.  Each node
>>>> currently has 73 GB for the affected CF and shows no indications that
>>>> old rows will be removed on their own.
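As a sanity check on those numbers (simple arithmetic, using only the figures quoted above):

```python
# 10 GB/day/node with a 72 h (3 day) TTL: once purging keeps pace with
# writes, steady-state should be roughly 3 days of writes per node.
gb_per_day = 10
ttl_days = 259_200 / 86_400          # 72 h = 3 days
expected_gb = gb_per_day * ttl_days
observed_gb = 73

print(expected_gb)                   # 30.0
print(observed_gb / expected_gb)     # ~2.4x more data on disk than expected
```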
>>>>
>>>> Why aren't rows being removed?  Below is some data from a sample row
>>>> which should have been removed several days ago but is still around even
>>>> though it has been involved in numerous compactions since expiring.
>>>>
>>>> $> ./bin/nodetool -h localhost getsstables metrics request_summary
>>>> 459fb460-5ace-11e2-9b92-11d67b6163b4
>>>>
>>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db
>>>>
>>>> $> ls -alF
>>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db
>>>> -rw-rw-r-- 1 sandra sandra 5252320 Jan 16 08:42
>>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db
>>>>
>>>> $> ./bin/sstable2json
>>>> /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db
>>>> -k $(echo -n 459fb460-5ace-11e2-9b92-11d67b6163b4 | hexdump  -e '36/1 "%x"')
>>>>
>>>> {
>>>>
>>>> "34353966623436302d356163652d313165322d396239322d313164363762363136336234":
>>>> [["app_name","50f21d3d",1357785277207001,"d"],
>>>> ["client_ip","50f21d3d",1357785277207001,"d"],
>>>> ["client_req_id","50f21d3d",1357785277207001,"d"],
>>>> ["mysql_call_cnt","50f21d3d",1357785277207001,"d"],
>>>> ["mysql_duration_us","50f21d3d",1357785277207001,"d"],
>>>> ["mysql_failure_call_cnt","50f21d3d",1357785277207001,"d"],
>>>> ["mysql_success_call_cnt","50f21d3d",1357785277207001,"d"],
>>>> ["req_duration_us","50f21d3d",1357785277207001,"d"],
>>>> ["req_finish_time_us","50f21d3d",1357785277207001,"d"],
>>>> ["req_method","50f21d3d",1357785277207001,"d"],
>>>> ["req_service","50f21d3d",1357785277207001,"d"],
>>>> ["req_start_time_us","50f21d3d",1357785277207001,"d"],
>>>> ["success","50f21d3d",1357785277207001,"d"]]
>>>> }
>>>>
>>>>
>>>> Decoding the column timestamps shows that the columns were written at
>>>> "Thu, 10 Jan 2013 02:34:37 GMT" and that their TTL expired at "Sun, 13
>>>> Jan 2013 02:34:37 GMT".  The date of the SSTable shows that it was
>>>> generated on Jan 16, which is 3 days after all columns had TTL-ed out.
>>>>
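That decoding can be reproduced directly: the third field of each column in the dump is its write timestamp in microseconds since the epoch, and the `50f21d3d` value is the local deletion time in seconds, hex-encoded (assuming those field meanings, which match what sstable2json shows for expired "d" columns):

```python
from datetime import datetime, timezone

write_ts_us = 1357785277207001     # column timestamp (microseconds)
deletion_ts = int("50f21d3d", 16)  # local deletion time (seconds, hex)

written = datetime.fromtimestamp(write_ts_us // 10**6, tz=timezone.utc)
expired = datetime.fromtimestamp(deletion_ts, tz=timezone.utc)

print(written)  # 2013-01-10 02:34:37+00:00
print(expired)  # 2013-01-13 02:34:37+00:00
print(deletion_ts - write_ts_us // 10**6)  # 259200 -- exactly the 72 h TTL
```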
>>>>
>>>> The schema shows that gc_grace is set to 0, since this data is
>>>> write-once, read-seldom, and is never updated or deleted.
>>>>
>>>>
>>>> create column family request_summary
>>>>   with column_type = 'Standard'
>>>>   and comparator = 'UTF8Type'
>>>>   and default_validation_class = 'UTF8Type'
>>>>   and key_validation_class = 'UTF8Type'
>>>>   and read_repair_chance = 0.1
>>>>   and dclocal_read_repair_chance = 0.0
>>>>   and gc_grace = 0
>>>>   and min_compaction_threshold = 4
>>>>   and max_compaction_threshold = 32
>>>>   and replicate_on_write = true
>>>>   and compaction_strategy =
>>>> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
>>>>   and caching = 'NONE'
>>>>   and bloom_filter_fp_chance = 1.0
>>>>   and compression_options = {'chunk_length_kb' : '64',
>>>> 'sstable_compression' :
>>>> 'org.apache.cassandra.io.compress.SnappyCompressor'};
>>>>
>>>>
>>>> Thanks in advance for help in understanding why rows such as this are
>>>> not removed!
>>>>
>>>> -Bryan
>>>>
>>>
>>>
>>>
>>
>
>
> --
> Derek Williams
>



-- 
Bryan Talbot
Architect / Platform team lead, Aeria Games and Entertainment
Silicon Valley | Berlin | Tokyo | Sao Paulo
