incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: TTL on SecondaryIndex Columns. A bug?
Date Thu, 20 Dec 2012 04:10:54 GMT
Well that was fun https://issues.apache.org/jira/browse/CASSANDRA-5079

Just testing my idea of a fix now.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/12/2012, at 10:33 AM, aaron morton <aaron@thelastpickle.com> wrote:

>> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
> Done, and I now get your repro case…
> 
> [default@ks123] get cf1 where 'indexedColumn'='65';
> 
> 0 Row Returned.
> Elapsed time: 1.44 msec(s).
> 
> 
> [default@ks123] get cf1 where 'indexedColumn'='66';
> -------------------
> RowKey: 66
> => (column=1, value=val, timestamp=1355952222439049, ttl=7884000)
> => (column=10, value=val, timestamp=1355952222439269, ttl=7884000)
> ...
> => (column=indexedColumn, value=66, timestamp=1355952223881937, ttl=7887600)
> 
> Looking into it now. 
> 
> Thanks
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 19/12/2012, at 9:56 PM, Roland Gude <roland.gude@ez.no> wrote:
> 
>> I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
>> Unfortunately, apart from me, no one has been able to reproduce it yet.
>> 
>> Check whether the data is available before and after compaction.
>> If you have leveled compaction it is hard to test because you cannot trigger compaction manually.
>> 
>> -----Original Message-----
>> From: Alexei Bakanov [mailto:russisk@gmail.com] 
>> Sent: Wednesday, 19 December 2012 09:35
>> To: user@cassandra.apache.org
>> Subject: Re: TTL on SecondaryIndex Columns. A bug?
>> 
>> I'm running on a single node on my laptop.
>> It looks like the point at which rows disappear from the index depends on the JVM memory settings. With more memory, more data has to be fed in before rows start disappearing.
>> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
>> 
>> To be sure, try to get rows for 'indexedColumn'='1':
>> 
>> [default@ks123] get cf1 where 'indexedColumn'='1';
>> 
>> 0 Row Returned.
>> 
>> Thanks
>> 
>> 
>> On 19 December 2012 05:15, aaron morton <aaron@thelastpickle.com> wrote:
>>> Thanks for the nice steps to reproduce.
>>> 
>>> I ran this on my MBP using C* 1.1.7 and got the expected results: both 
>>> gets returned a row.
>>> 
>>> Were you running against a single node or a cluster? If a cluster, did 
>>> you change the CL? cassandra-cli defaults to ONE.
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> New Zealand
>>> 
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 18/12/2012, at 9:44 PM, Alexei Bakanov <russisk@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> We are having an issue with TTL on Secondary index columns. We get 0 
>>> rows in return when running queries on indexed columns that have TTL.
>>> Everything works fine with small amounts of data, but when we get over 
>>> a certain threshold it looks like older rows disappear from the index.
>>> In the example below we create 70 rows with 45k columns each + one 
>>> indexed column with just the rowkey as value, so we have one row per 
>>> indexed value. When the script is finished the index contains rows 
>>> 66-69. Rows 0-65 are gone from the index.
>>> Using 'indexedColumn' without TTL fixes the problem.
>>> 
>>> 
>>> ------------- SCHEMA START -----------------
>>> create keyspace ks123
>>> with placement_strategy = 'NetworkTopologyStrategy'
>>> and strategy_options = {datacenter1 : 1}
>>> and durable_writes = true;
>>> 
>>> use ks123;
>>> 
>>> create column family cf1
>>> with column_type = 'Standard'
>>> and comparator = 'AsciiType'
>>> and default_validation_class = 'AsciiType'
>>> and key_validation_class = 'AsciiType'
>>> and read_repair_chance = 0.1
>>> and dclocal_read_repair_chance = 0.0
>>> and gc_grace = 864000
>>> and min_compaction_threshold = 4
>>> and max_compaction_threshold = 32
>>> and replicate_on_write = true
>>> and compaction_strategy =
>>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>>> and caching = 'KEYS_ONLY'
>>> and column_metadata = [
>>>   {column_name : 'indexedColumn',
>>>   validation_class : AsciiType,
>>>   index_name : 'INDEX1',
>>>   index_type : 0}]
>>> and compression_options = {'sstable_compression' :
>>> 'org.apache.cassandra.io.compress.SnappyCompressor'};
>>> ------------- SCHEMA FINISH -----------------
>>> 
>>> ------------- POPULATE START -----------------
>>> from pycassa.batch import Mutator
>>> import pycassa
>>> 
>>> pool = pycassa.ConnectionPool('ks123')
>>> cf = pycassa.ColumnFamily(pool, 'cf1')
>>> 
>>> for rowKey in xrange(70):
>>>   # Use a fresh mutator for each row
>>>   b = Mutator(pool)
>>>   for datapoint in xrange(1, 45001):
>>>       b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
>>>   # The indexed column holds the row key; its TTL is one hour longer
>>>   b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
>>>   print 'row %d' % rowKey
>>>   b.send()
>>> 
>>> pool.dispose()
>>> ------------- POPULATE FINISH -----------------
>>> 
>>> ------------- QUERY START -----------------
>>> [default@ks123] get cf1 where 'indexedColumn'='65';
>>> 
>>> 0 Row Returned.
>>> Elapsed time: 2.38 msec(s).
>>> 
>>> [default@ks123] get cf1 where 'indexedColumn'='66';
>>> -------------------
>>> RowKey: 66
>>> => (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
>>> ...
>>> => (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
>>> => (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
>>> 
>>> 1 Row Returned.
>>> Elapsed time: 31 msec(s).
>>> ------------- QUERY FINISH -----------------
>>> 
>>> This is all using Cassandra 1.1.7 with default settings.
>>> 
>>> Best regards,
>>> 
>>> Alexei Bakanov
>>> 
>>> 
>> 
>> 
> 
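
The checks in this thread query one indexed value at a time by hand ('1', '65', '66'). A small helper could sweep all 70 values and list the ones for which the index returns nothing, which also makes it easy to compare the result before and after compaction. The following is only a sketch: `missing_index_values` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that mimics the reported symptom rather than a real Cassandra lookup. With pycassa, the `query` callable could presumably wrap `cf.get_indexed_slices` with a clause built from `pycassa.index.create_index_expression` and `create_index_clause`; that wiring is an assumption and is not exercised here.

```python
def missing_index_values(query, values):
    """Return the indexed values for which the index query yields no rows.

    `query` is any callable that takes one indexed value and returns an
    iterable of (row_key, columns) pairs. It is a placeholder for a real
    secondary-index lookup and is not part of any Cassandra client API.
    """
    missing = []
    for value in values:
        # Materialise the result so generators work as well as lists.
        rows = list(query(value))
        if not rows:
            missing.append(value)
    return missing


def fake_query(value):
    # Hypothetical stand-in that mimics the symptom reported in the thread:
    # only the index entries for rows 66-69 are still visible.
    if 66 <= int(value) <= 69:
        return [(value, {'indexedColumn': value})]
    return []


# The populate script writes rows 0-69, one indexed value per row.
expected = [str(i) for i in range(70)]
missing = missing_index_values(fake_query, expected)
print('%d of %d indexed values return no rows' % (len(missing), len(expected)))
```

Running the same sweep once before and once after a compaction, then comparing the two lists, would show whether compaction is the point at which the entries vanish.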

