incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: secondery indexes TTL - strange issues
Date Mon, 17 Sep 2012 01:45:51 GMT
> Data gets inserted and is accessible via index query for some time. At some point in time the indexes are completely empty and start filling again (while new data enters the system).
If you can reproduce this, please create a ticket in the Cassandra JIRA: https://issues.apache.org/jira/browse/CASSANDRA

If you can include DEBUG level logs that would be helpful. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 10:08 PM, Roland Gude <roland.gude@ez.no> wrote:

> I am not sure it is compacting an old file: the same thing happens every time I rebuild the index. New files appear, get compacted, and vanish.
>  
> We have set up a new, smaller cluster with fresh data. The same thing happens there as well. Data gets inserted and is accessible via index query for some time. At some point in time the indexes are completely empty and start filling again (while new data enters the system).
>  
> I am currently testing with SizeTiered on both the fresh set and the imported set.
>  
> For the fresh set (which is significantly smaller), first results suggest that the issue does not happen with SizeTieredCompaction. I have not yet tested everything that comes to mind and will update if something new comes up.
>  
> As for the failing query, it is from the CLI:
> get EventsByItem where 00000003-0000-1000-0000-000000000000=utf8('someValue');
> 00000003-0000-1000-0000-000000000000 is a TimeUUID we use as a marker for a time series.
> (Equivalent queries with Astyanax and Hector show the same behavior.)
>  
> This is a cf with the issue:
>  
> create column family EventsByItem
>   with column_type = 'Standard'
>   and comparator = 'TimeUUIDType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'BytesType'
>   and read_repair_chance = 0.5
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
>   and caching = 'NONE'
>   and column_metadata = [
>     {column_name : '00000000-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_mandatorIndex',
>     index_type : 0},
>     {column_name : '00000002-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_itemidIndex',
>     index_type : 0},
>     {column_name : '00000003-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_eventtypeIndex',
>     index_type : 0}]
>   and compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};
>  
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Friday, 14 September 2012 10:46
> To: user@cassandra.apache.org
> Subject: Re: secondery indexes TTL - strange issues
>  
> INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
> There are a lot of weird things here.
> It could be leveled compaction compacting an older file for the first time, but that would be a guess.
>  
> Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.
> Are you able to do a test with SizeTieredCompaction?
>  
> Are you able to replicate the problem with a fresh testing CF and some test data?
>  
> If it's only a problem with imported data, can you provide a sample of the failing query? And maybe the CF definition?
>  
> Cheers
>  
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 14/09/2012, at 2:46 AM, Roland Gude <roland.gude@ez.no> wrote:
> 
> 
> Hi,
>  
> we have been running a system on Cassandra 0.7 that relies heavily on secondary indexes for columns with TTL.
> This has been working like a charm, but we are trying hard to move forward with Cassandra and are stuck at this point:
>  
> When we put our data into a new cluster (any 1.1.x version, currently 1.1.5), rebuild the indexes, and run our system, everything seems to work well until, at some point, index queries no longer return any data at all (note that the TTL will not expire for several months).
> Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.
>  
> What seems strange is that compaction apparently is very aggressive:
>  
> INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
>  
>  
> Actually, we have switched to LeveledCompaction. Could it be that leveled compaction does not play nicely with indexes?
>  
>  
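
To put numbers on how aggressive the compaction in the quoted log line is, here is a minimal Python sketch that parses the line and recomputes the shrink ratio, duration, and throughput. The names `LOG_LINE`, `PATTERN`, and `summarize_compaction` are hypothetical helpers for illustration, not part of Cassandra, and the regular expression is only an assumption fitted to the one line quoted in this thread.

```python
import re

# The compaction summary quoted in the thread, joined back into one line.
LOG_LINE = (
    "INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 "
    "CompactionTask.java (line 221) Compacted to "
    "[/var/lib/cassandra/data/Eventstore/EventsByItem/"
    "Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  "
    "78,623,000 to 373,348 (~0% of original) bytes for 83 keys "
    "at 0.000280MB/s.  Time: 1,272,883ms."
)

# Assumed pattern: fitted to the line above, not taken from Cassandra source.
PATTERN = re.compile(
    r"([\d,]+) to ([\d,]+) \(~\d+% of original\) bytes for ([\d,]+) keys"
    r" at ([\d.]+)MB/s\.\s+Time: ([\d,]+)ms"
)


def _to_int(text):
    """Convert a number with thousands separators, e.g. '78,623,000'."""
    return int(text.replace(",", ""))


def summarize_compaction(line):
    """Return sizes, ratio, duration, and throughput for one summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised compaction summary line")
    before, after, keys, _logged_rate, millis = match.groups()
    bytes_before = _to_int(before)
    bytes_after = _to_int(after)
    seconds = _to_int(millis) / 1000.0
    return {
        "bytes_before": bytes_before,
        "bytes_after": bytes_after,
        "keys": _to_int(keys),
        "percent_remaining": 100.0 * bytes_after / bytes_before,
        "minutes": seconds / 60.0,
        # Output bytes per second, expressed in MB (1024 * 1024 bytes).
        "mb_per_second": bytes_after / (1024.0 * 1024.0) / seconds,
    }


if __name__ == "__main__":
    for name, value in summarize_compaction(LOG_LINE).items():
        print(name, value)
```

Under these assumptions, the index SSTable shrinks to a little under 0.5% of its input size, which the log rounds down to "~0%". The compaction runs for roughly 21 minutes. Dividing the output bytes by the elapsed time gives about 0.00028 MB/s, which appears consistent with the logged rate.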

