incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: TTL on SecondaryIndex Columns. A bug?
Date Wed, 19 Dec 2012 21:33:20 GMT
> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
Done, and I now get your repro case…

[default@ks123] get cf1 where 'indexedColumn'='65';

0 Row Returned.
Elapsed time: 1.44 msec(s).


[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355952222439049, ttl=7884000)
=> (column=10, value=val, timestamp=1355952222439269, ttl=7884000)
...
=> (column=indexedColumn, value=66, timestamp=1355952223881937, ttl=7887600)

Looking into it now. 

Thanks

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/12/2012, at 9:56 PM, Roland Gude <roland.gude@ez.no> wrote:

> I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
> Unfortunately, apart from me, no one has been able to reproduce it yet.
> 
> Check if data is available before/after compaction.
> If you have leveled compaction it is hard to test because you cannot trigger compaction manually.
> 
> -----Original Message-----
> From: Alexei Bakanov [mailto:russisk@gmail.com]
> Sent: Wednesday, 19 December 2012 09:35
> To: user@cassandra.apache.org
> Subject: Re: TTL on SecondaryIndex Columns. A bug?
> 
> I'm running on a single node on my laptop.
> It looks like the point at which rows disappear from the index depends on the JVM memory settings. With more memory, more data has to be fed in before things start disappearing.
> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
> 
> To be sure, try to get rows for 'indexedColumn'='1':
> 
> [default@ks123] get cf1 where 'indexedColumn'='1';
> 
> 0 Row Returned.
> 
> Thanks
> 
> 
> On 19 December 2012 05:15, aaron morton <aaron@thelastpickle.com> wrote:
>> Thanks for the nice steps to reproduce.
>> 
>> I ran this on my MBP using C* 1.1.7 and got the expected results: both 
>> gets returned a row.
>> 
>> Were you running against a single node or a cluster? If a cluster, did 
>> you change the CL? cassandra-cli defaults to ONE.
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 18/12/2012, at 9:44 PM, Alexei Bakanov <russisk@gmail.com> wrote:
>> 
>> Hi,
>> 
>> We are having an issue with TTL on Secondary index columns. We get 0 
>> rows in return when running queries on indexed columns that have TTL.
>> Everything works fine with small amounts of data, but when we get over 
>> a certain threshold it looks like older rows disappear from the index.
>> In the example below we create 70 rows with 45k columns each + one 
>> indexed column with just the rowkey as value, so we have one row per 
>> indexed value. When the script is finished the index contains rows 
>> 66-69. Rows 0-65 are gone from the index.
>> Using 'indexedColumn' without TTL fixes the problem.
>> 
>> 
>> ------------- SCHEMA START -----------------
>> create keyspace ks123
>> with placement_strategy = 'NetworkTopologyStrategy'
>> and strategy_options = {datacenter1 : 1}
>> and durable_writes = true;
>> 
>> use ks123;
>> 
>> create column family cf1
>> with column_type = 'Standard'
>> and comparator = 'AsciiType'
>> and default_validation_class = 'AsciiType'
>> and key_validation_class = 'AsciiType'
>> and read_repair_chance = 0.1
>> and dclocal_read_repair_chance = 0.0
>> and gc_grace = 864000
>> and min_compaction_threshold = 4
>> and max_compaction_threshold = 32
>> and replicate_on_write = true
>> and compaction_strategy =
>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>> and caching = 'KEYS_ONLY'
>> and column_metadata = [
>>   {column_name : 'indexedColumn',
>>   validation_class : AsciiType,
>>   index_name : 'INDEX1',
>>   index_type : 0}]
>> and compression_options = {'sstable_compression' :
>> 'org.apache.cassandra.io.compress.SnappyCompressor'};
>> ------------- SCHEMA FINISH -----------------
>> 
>> ------------- POPULATE START -----------------
>> from pycassa.batch import Mutator
>> import pycassa
>> 
>> pool = pycassa.ConnectionPool('ks123')
>> cf = pycassa.ColumnFamily(pool, 'cf1')
>> 
>> for rowKey in xrange(70):
>>   b = Mutator(pool)
>>   for datapoint in xrange(1, 45001):
>>       b.insert(cf,str(rowKey), {str(datapoint): 'val'}, ttl=7884000);
>>   b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600);
>>   print 'row %d' % rowKey
>>   b.send()
>>   b = Mutator(pool)
>> 
>> pool.dispose()
>> ------------- POPULATE FINISH -----------------
>> 
>> ------------- QUERY START -----------------
>> [default@ks123] get cf1 where 'indexedColumn'='65';
>> 
>> 0 Row Returned.
>> Elapsed time: 2.38 msec(s).
>> 
>> [default@ks123] get cf1 where 'indexedColumn'='66';
>> -------------------
>> RowKey: 66
>> => (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
>> ...
>> => (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
>> => (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
>> 
>> 1 Row Returned.
>> Elapsed time: 31 msec(s).
>> ------------- QUERY FINISH -----------------
>> 
>> This is all using Cassandra 1.1.7 with default settings.
>> 
>> Best regards,
>> 
>> Alexei Bakanov
>> 
>> 
> 
> 

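For reference, the two TTLs used in the populate script can be converted to more readable units. This is only a small sketch of the arithmetic; the constant and function names are made up for illustration, and the values are copied from the script quoted above. It suggests that plain expiry would not be expected to explain rows missing from the index right after the script finishes, since both TTLs are roughly three months.

```python
# TTL values copied from the populate script in this thread.
DATA_TTL_SECONDS = 7884000    # ttl on the 45,000 data columns per row
INDEX_TTL_SECONDS = 7887600   # ttl on 'indexedColumn'

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def ttl_in_days(ttl_seconds):
    """Convert a TTL given in seconds to fractional days."""
    return ttl_seconds / float(SECONDS_PER_DAY)


# How much longer the indexed column lives than the data columns, in hours.
ttl_gap_hours = (INDEX_TTL_SECONDS - DATA_TTL_SECONDS) / float(SECONDS_PER_HOUR)

print("data TTL in days:  %s" % ttl_in_days(DATA_TTL_SECONDS))
print("index TTL in days: %s" % ttl_in_days(INDEX_TTL_SECONDS))
print("index column outlives data columns by %s hour(s)" % ttl_gap_hours)
```
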

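To find the point at which indexed values stop returning rows, instead of probing values one at a time in cassandra-cli, a loop along the following lines might help. This is a hedged sketch, not something from the thread: `find_missing_index_values` and `fake_lookup` are hypothetical names, and the stand-in lookup simply mimics the reported result (only rows 66-69 visible). Against a real cluster the `lookup` callable would presumably wrap a secondary-index query, for example pycassa's `ColumnFamily.get_indexed_slices`, which is not exercised here.

```python
def find_missing_index_values(values, lookup):
    """Return the values for which lookup(value) yields no rows.

    `lookup` is a hypothetical callable: it takes one indexed value and
    returns an iterable of matching rows (empty if nothing matches).
    """
    missing = []
    for value in values:
        rows = list(lookup(value))
        if not rows:
            missing.append(value)
    return missing


def fake_lookup(value):
    """Stand-in for a real index query; mimics the result reported in
    this thread, where only rows 66-69 are still found via the index."""
    visible = set(str(i) for i in range(66, 70))
    return [value] if value in visible else []


# The populate script writes row keys '0'..'69', one indexed value per row.
all_values = [str(i) for i in range(70)]
missing = find_missing_index_values(all_values, fake_lookup)
print("values missing from the index: %d (first %s, last %s)"
      % (len(missing), missing[0], missing[-1]))
```

Running the same loop before and after a compaction, as suggested earlier in the thread, would show whether the set of missing values changes.
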