incubator-cassandra-user mailing list archives

From Roland Gude <roland.g...@ez.no>
Subject Re: secondary indexes TTL - strange issues
Date Fri, 14 Sep 2012 10:08:18 GMT
I am not sure it is compacting an old file: the same thing happens every time I rebuild the
index. New files appear, get compacted, and vanish.

We have set up a new, smaller cluster with fresh data. The same thing happens there as well. Data
gets inserted and is accessible via index queries for some time. At some point the indexes
are completely empty and start filling again (while new data enters the system).

I am currently testing with SizeTiered on both the fresh set and the imported set.

For the fresh set (which is significantly smaller), first results suggest that the issue does not
happen with SizeTieredCompaction. I have not yet tested everything that comes to
mind and will update if something new comes up.

As for the failing query, it is from the CLI:
get EventsByItem where 00000003-0000-1000-0000-000000000000=utf8('someValue');
00000003-0000-1000-0000-000000000000 is a TimeUUID we use as a marker for a time series.
(Equivalent queries with Astyanax and Hector behave the same way.)

This is a cf with the issue:

create column family EventsByItem
  with column_type = 'Standard'
  and comparator = 'TimeUUIDType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and read_repair_chance = 0.5
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'NONE'
  and column_metadata = [
    {column_name : '00000000-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_mandatorIndex',
    index_type : 0},
    {column_name : '00000002-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_itemidIndex',
    index_type : 0},
    {column_name : '00000003-0000-1000-0000-000000000000',
    validation_class : BytesType,
    index_name : 'ebi_eventtypeIndex',
    index_type : 0}]
  and compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Friday, 14 September 2012 10:46
To: user@cassandra.apache.org
Subject: Re: secondary indexes TTL - strange issues

INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
There are a lot of weird things here.
It could be levelled compaction compacting an older file for the first time. But that would
be a guess.

Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.
Are you able to do a test with SizeTieredCompaction?

Are you able to replicate the problem with a fresh testing CF and some test Data?

If it's only a problem with imported data, can you provide a sample of the failing query?
And maybe the CF definition?

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 2:46 AM, Roland Gude <roland.gude@ez.no> wrote:


Hi,

we have been running a system on Cassandra 0.7 heavily relying on secondary indexes for columns
with TTL.
This has been working like a charm, but we are trying hard to move forward with Cassandra
and are struggling at that point:

When we put our data into a new cluster (any 1.1.x version, currently 1.1.5), rebuild indexes
and run our system, everything seems to work well until, at some point, index queries
do not return any data at all anymore (note that the TTL will not expire for several months).
Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.

What seems strange is that compaction apparently is very aggressive:

INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
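
To put the figures in that log line into perspective, here is a minimal sketch that parses it and recomputes the size ratio and throughput. The regular expression and the name parse_compaction_line are made up for illustration and are not part of Cassandra. It also assumes that the MB/s figure is the output size divided by the elapsed time, with 1 MB = 1024 * 1024 bytes.

```python
import re

# Log line copied from the message above, joined onto a single line.
LOG_LINE = (
    "INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) "
    "Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/"
    "Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  "
    "78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  "
    "Time: 1,272,883ms."
)

# Hypothetical pattern written for this one line format; not a Cassandra API.
PATTERN = re.compile(
    r"([\d,]+) to ([\d,]+) \(~\d+% of original\) bytes for ([\d,]+) keys"
    r".*?Time: ([\d,]+)ms"
)


def parse_compaction_line(line):
    """Extract sizes, key count and duration from a compaction summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a compaction summary line")
    # Numbers are printed with thousands separators, so strip the commas.
    before, after, keys, millis = (int(g.replace(",", "")) for g in match.groups())
    seconds = millis / 1000.0
    return {
        "bytes_before": before,
        "bytes_after": after,
        "keys": keys,
        "seconds": seconds,
        # Fraction of the input that survived compaction.
        "ratio": after / before,
        # Assumption: throughput is output bytes per second, in MiB.
        "mb_per_s": after / (1024 * 1024) / seconds,
    }


if __name__ == "__main__":
    for name, value in parse_compaction_line(LOG_LINE).items():
        print(name, value)
```

Under these assumptions, the compaction kept under 0.5% of the input bytes and ran for about 21 minutes, which is consistent with the "~0% of original" and 0.000280MB/s values in the log.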


Actually we have switched to LeveledCompaction. Could it be that leveled compaction does not
play nice with indexes?



