cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Weird problem with empty CF
Date Tue, 04 Oct 2011 08:27:27 GMT
Yes that's the slice query skipping past the tombstone columns. 
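
For reference, each "collecting" line below decodes (roughly, going by the 0.8 source) as name:deleted-flag:value-length@timestamp. So a line like

    1317582939743663:true:4@1317582939933000

is a column named 1317582939743663, flagged "true" as a tombstone, written at timestamp 1317582939933000; "0 of 1" means zero live columns collected so far against a limit of one.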

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2011, at 4:24 PM, Daning Wang wrote:

> Lots of SliceQueryFilter entries in the log, is that handling tombstones?
> 
> DEBUG [ReadStage:49] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582939743663:true:4@1317582939933000
> DEBUG [ReadStage:50] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317573253148778:true:4@1317573253354000
> DEBUG [ReadStage:43] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317669552951428:true:4@1317669553018000
> DEBUG [ReadStage:33] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581886709261:true:4@1317581886957000
> DEBUG [ReadStage:52] 2011-10-03 20:15:07,942 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317568165152246:true:4@1317568165482000
> DEBUG [ReadStage:36] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317567265089211:true:4@1317567265405000
> DEBUG [ReadStage:53] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317674324843122:true:4@1317674324946000
> DEBUG [ReadStage:38] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317571990078721:true:4@1317571990141000
> DEBUG [ReadStage:57] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317671855234221:true:4@1317671855239000
> DEBUG [ReadStage:54] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317558305262954:true:4@1317558305337000
> DEBUG [RequestResponseStage:11] 2011-10-03 20:15:07,941 ResponseVerbHandler.java (line 48) Processing response on a callback from 12347@/10.210.101.104
> DEBUG [RequestResponseStage:9] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed data response
> DEBUG [RequestResponseStage:13] 2011-10-03 20:15:07,941 AbstractRowResolver.java (line 66) Preprocessed digest response
> DEBUG [ReadStage:58] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317581337972739:true:4@1317581338044000
> DEBUG [ReadStage:64] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582656796332:true:4@1317582656970000
> DEBUG [ReadStage:55] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317569432886284:true:4@1317569432984000
> DEBUG [ReadStage:45] 2011-10-03 20:15:07,941 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317572658687019:true:4@1317572658718000
> DEBUG [ReadStage:47] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317582281617755:true:4@1317582281717000
> DEBUG [ReadStage:48] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 1317549607869226:true:4@1317549608118000
> DEBUG [ReadStage:34] 2011-10-03 20:15:07,940 SliceQueryFilter.java (line 123) collecting 0 of 1: 
> On Thu, Sep 29, 2011 at 2:17 PM, aaron morton <aaron@thelastpickle.com> wrote:
> As with any situation involving the un-dead, it really is the number of Zombies, Mummies or Vampires that is the concern.
> 
> If you delete data there will always be tombstones. If you have a delete-heavy workload there will be more tombstones. This is why implementing a queue with cassandra is a bad idea.
> 
> gc_grace_seconds (and column TTL) set the *minimum* amount of time the tombstones will stay in the data files; there is no maximum.
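> 
> If you do want a shorter floor, gc_grace_seconds can be changed per column family, e.g. from cassandra-cli (MyCF is a placeholder name; 86400 is one day):
> 
>     update column family MyCF with gc_grace = 86400;
> 
> Just make sure a repair completes inside that window on every node, or deleted data can come back to life.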
> 
> Your read performance also depends on the number of SSTables the row is spread over, see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
> 
> If you really wanted to purge them then yes, a repair and then a major compaction would be the way to go; a sketch of the commands follows. Also consider whether it's possible to design the data model around the problem, e.g. partitioning rows by date. IMHO I would look to make data model changes before implementing a compaction policy, or consider whether cassandra is the right store if you have a delete-heavy workload.
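> 
> As a sketch of that sequence (host, keyspace and column family names are placeholders):
> 
>     nodetool -h <host> repair MyKeyspace MyCF
>     nodetool -h <host> compact MyKeyspace MyCF
> 
> Bear in mind that a major compaction leaves one big SSTable per node, which future minor compactions will be slow to touch again.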
> 
> Cheers
> 
>  
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 30/09/2011, at 3:27 AM, Daning Wang wrote:
> 
>> Jonathan/Aaron,
>> 
>> Thank you both for the replies, I will change GCGracePeriod to 1 day to see what happens.
>> 
>> Is there a way to purge tombstones at any time? Because if tombstones affect performance, we want them purged right away, not after GCGracePeriod. We know all the nodes are up, and we can do a repair first to make sure of consistency before purging.
>> 
>> Thanks,
>> 
>> Daning
>> 
>> 
>> On Wed, Sep 28, 2011 at 5:22 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> If I had to guess I would say it was spending time handling tombstones. If you see it happen again, and are interested, turn the logging up to DEBUG and look for messages from something starting with "Slice".
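>> 
>> In 0.8 that's conf/log4j-server.properties; you can turn up just the read filters instead of everything, along the lines of (logger name per the package SliceQueryFilter lives in):
>> 
>>     log4j.logger.org.apache.cassandra.db.filter=DEBUG
>> 
>> or raise the rootLogger to DEBUG if you don't mind very noisy logs.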
>> 
>> Minor (automatic) compaction will, over time, purge the tombstones. Until then reads must read and discard the data deleted by the tombstones. If you perform a big (i.e. 100k's of columns) delete this can reduce performance until compaction does its thing.
>> 
>> My second guess would be read repair (or the simple consistency checks on read) kicking in. That would show up in the "ReadRepairStage" in TPSTATS.
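>> 
>> e.g. (column layout from memory, numbers purely illustrative):
>> 
>>     $ nodetool -h localhost tpstats
>>     Pool Name                    Active   Pending      Completed
>>     ReadStage                         4        12        1837490
>>     ReadRepairStage                   2        30         220133
>> 
>> Constant ReadRepairStage activity on an otherwise quiet cluster points at reads doing extra consistency work.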
>> 
>> It may have been neither of those two things, just guesses. If you have more issues let us know and provide some more info.
>> 
>> Cheers
>> 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 29/09/2011, at 6:35 AM, Daning wrote:
>> 
>> > I have an app polling a few CFs (select first N * from CF). There was data in the CFs, but it was later deleted, so the CFs had been empty for a long time. I found Cassandra CPU usage was getting as high as 80%, when it normally uses less than 30%. I issued the select query manually and the response felt slow. I tried nodetool compact/repair for those CFs but that did not help. Later, I issued 'truncate' for all the CFs and CPU usage dropped to 1%.
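>> >
>> > (For concreteness, the poll was along these lines, with MyCF standing in for the real column family:
>> >
>> >     SELECT FIRST 10 * FROM MyCF;
>> >
>> > and the workaround was truncate MyCF; from the CLI.)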
>> >
>> > Can somebody explain why I needed to truncate an empty CF? And what else could I do to bring the CPU usage down?
>> >
>> > I am running 0.8.6.
>> >
>> > Thanks,
>> >
>> > Daning
>> >
>> 
>> 
> 
> 

