incubator-cassandra-user mailing list archives

From Roland Gude <roland.g...@ez.no>
Subject AW: TTL on SecondaryIndex Columns. A bug?
Date Wed, 19 Dec 2012 08:56:27 GMT
I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
Unfortunately, apart from me, no one has been able to reproduce it yet.

Check whether the data is available before and after compaction.
If you use leveled compaction, this is hard to test because you cannot trigger compaction manually.
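
One way to run that before/after check is to record which indexed values still resolve to a row, then compare the two snapshots. The sketch below is only an illustration under stated assumptions: `missing_index_values`, `newly_missing` and the `lookup` callable are hypothetical names, not part of pycassa or Cassandra. `lookup(value)` is assumed to return the row keys that the index yields for 'indexedColumn' = value; how it queries the cluster is left open.

```python
def missing_index_values(lookup, row_count):
    """Return the indexed values for which the index lookup finds no row.

    lookup    -- hypothetical callable: value (str) -> iterable of row keys
    row_count -- number of rows written by the populate script (70 there)
    """
    missing = []
    for n in range(row_count):
        value = str(n)
        if not list(lookup(value)):
            missing.append(value)
    return missing


def newly_missing(before, after):
    """Values that were found before compaction but are missing after it."""
    return sorted(set(after) - set(before), key=int)


if __name__ == "__main__":
    # Stand-in for a real index query: mimics the reported state in which
    # only rows 66-69 are still reachable through the index.
    def fake_lookup(value):
        return [value] if int(value) >= 66 else []

    gone = missing_index_values(fake_lookup, 70)
    print(len(gone), gone[0], gone[-1])
```

Running `missing_index_values` once before and once after compaction and passing both results to `newly_missing` would show whether compaction is the step at which index entries go missing.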

-----Original Message-----
From: Alexei Bakanov [mailto:russisk@gmail.com]
Sent: Wednesday, 19 December 2012 09:35
To: user@cassandra.apache.org
Subject: Re: TTL on SecondaryIndex Columns. A bug?

I'm running on a single node on my laptop.
It looks like the point at which rows disappear from the index depends on the JVM memory settings.
With more memory, more data has to be fed in before rows start disappearing.
Please try running Cassandra with -Xms1927M -Xmx1927M -Xmn400M

To be sure, try to get rows for 'indexedColumn'='1':

[default@ks123] get cf1 where 'indexedColumn'='1';

0 Row Returned.

Thanks


On 19 December 2012 05:15, aaron morton <aaron@thelastpickle.com> wrote:
> Thanks for the nice steps to reproduce.
>
> I ran this on my MBP using C* 1.1.7 and got the expected results: both 
> gets returned a row.
>
> Were you running against a single node or a cluster? If a cluster, did 
> you change the CL? cassandra-cli defaults to ONE.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/12/2012, at 9:44 PM, Alexei Bakanov <russisk@gmail.com> wrote:
>
> Hi,
>
> We are having an issue with TTL on secondary index columns. We get 0 
> rows in return when running queries on indexed columns that have a TTL.
> Everything works fine with small amounts of data, but when we get over 
> a certain threshold it looks like older rows disappear from the index.
> In the example below we create 70 rows with 45k columns each + one 
> indexed column with just the rowkey as value, so we have one row per 
> indexed value. When the script is finished the index contains rows 
> 66-69. Rows 0-65 are gone from the index.
> Using 'indexedColumn' without TTL fixes the problem.
>
>
> ------------- SCHEMA START -----------------
> create keyspace ks123
>  with placement_strategy = 'NetworkTopologyStrategy'
>  and strategy_options = {datacenter1 : 1}
>  and durable_writes = true;
>
> use ks123;
>
> create column family cf1
>  with column_type = 'Standard'
>  and comparator = 'AsciiType'
>  and default_validation_class = 'AsciiType'
>  and key_validation_class = 'AsciiType'
>  and read_repair_chance = 0.1
>  and dclocal_read_repair_chance = 0.0
>  and gc_grace = 864000
>  and min_compaction_threshold = 4
>  and max_compaction_threshold = 32
>  and replicate_on_write = true
>  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>  and caching = 'KEYS_ONLY'
>  and column_metadata = [
>    {column_name : 'indexedColumn',
>    validation_class : AsciiType,
>    index_name : 'INDEX1',
>    index_type : 0}]
>  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ------------- SCHEMA FINISH -----------------
>
> ------------- POPULATE START -----------------
> from pycassa.batch import Mutator
> import pycassa
>
> pool = pycassa.ConnectionPool('ks123')
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for rowKey in xrange(70):
>    b = Mutator(pool)
>    # 45,000 data columns per row, each with a TTL of 7884000 seconds
>    for datapoint in xrange(1, 45001):
>        b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
>    # one indexed column per row; its value is the row key
>    b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
>    print 'row %d' % rowKey
>    b.send()
>    b = Mutator(pool)
>
> pool.dispose()
> ------------- POPULATE FINISH -----------------
>
> ------------- QUERY START -----------------
> [default@ks123] get cf1 where 'indexedColumn'='65';
>
> 0 Row Returned.
> Elapsed time: 2.38 msec(s).
>
> [default@ks123] get cf1 where 'indexedColumn'='66';
> -------------------
> RowKey: 66
> => (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
> ...
> => (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
> => (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
>
> 1 Row Returned.
> Elapsed time: 31 msec(s).
> ------------- QUERY FINISH -----------------
>
> This is all using Cassandra 1.1.7 with default settings.
>
> Best regards,
>
> Alexei Bakanov
>
>
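
As a sanity check on the numbers in the reproduction above, the sketch below converts the TTL and gc_grace values to days and counts the columns written. Both TTLs come to roughly 91 days, so none of the columns should have expired by the time the queries were run. The constant and function names are illustrative only and do not come from pycassa or Cassandra.

```python
SECONDS_PER_DAY = 24 * 60 * 60

DATA_TTL = 7884000     # ttl on each data column in the populate script
INDEXED_TTL = 7887600  # ttl on 'indexedColumn'
GC_GRACE = 864000      # gc_grace from the schema


def to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


def total_columns(rows, data_columns_per_row):
    """Data columns plus one indexed column per row."""
    return rows * data_columns_per_row + rows


print("data TTL (days):      ", to_days(DATA_TTL))
print("indexed TTL (days):   ", to_days(INDEXED_TTL))
print("TTL difference (s):   ", INDEXED_TTL - DATA_TTL)
print("gc_grace (days):      ", to_days(GC_GRACE))
print("columns written total:", total_columns(70, 45000))
```

The indexed column's TTL is one hour longer than that of the data columns, and the whole run writes a little over three million columns.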


