cassandra-user mailing list archives

From Todd Burruss <bburr...@expedia.com>
Subject Re: deleting rows and tombstones
Date Tue, 14 Feb 2012 22:39:09 GMT
I +1'd CASSANDRA-3620

From: Dominic Williams <dwilliams@fightmymonster.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tue, 14 Feb 2012 14:30:36 -0800
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: deleting rows and tombstones

Hi, that's a good question. Maybe we are hanging on to lessons we shouldn't need, but...

Currently, even on 1.0.7, we still get significant deleted data popping back up when repair hasn't run.

The irony is that it is usually cluster disruption caused by repair-initiated compaction storms
that seems to be the driver of these problems(!), but what that shows is that ideally you still
need to run repair processes.

Incidentally, the primary purpose of CASSANDRA-3620 is not to prevent data popping up when
repair hasn't been run, but rather to improve performance by allowing tombstones to be removed
by compaction at the earliest opportunity (or even to be removed before they even make it
into an sstable). As they build up they can affect performance badly. I'd like to see the
end of the whole idea of GCGraceSeconds.

On 14 February 2012 21:29, Todd Burruss <bburruss@expedia.com> wrote:
do you find that repair is still as necessary now that hinted handoffs are stored any time
a node does not ACK successfully?

From: Dominic Williams <dwilliams@fightmymonster.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tue, 14 Feb 2012 12:31:45 -0800
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: deleting rows and tombstones

Hi Todd,

Our systems do a lot of deletions and it does cause problems.

Your best bet is to bring GCGraceSeconds low and run repair religiously. The issues you can run
into, though, are repair overloading your servers when your data load gets high, repair falling
over, and related problems.
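
To illustrate why repair and the GC grace period are tied together, here is a toy model in Python. It is not Cassandra code and was not posted in this thread; the replica dictionaries, the `compact` and `repair` helpers, and the timestamps are all assumptions made up for the sketch. It shows a delete that reaches only one replica: if repair runs before the tombstone is purged, the row stays deleted, otherwise the old value comes back.

```python
# Toy model only: each replica maps key -> (timestamp, value).
# A value of None stands in for a tombstone.
TOMBSTONE = None


def compact(replica, now, gc_grace_seconds):
    """Drop tombstones that are older than gc_grace_seconds."""
    for key in list(replica):
        ts, value = replica[key]
        if value is TOMBSTONE and now - ts >= gc_grace_seconds:
            del replica[key]


def repair(replicas):
    """Copy the newest (timestamp, value) for every key to all replicas."""
    keys = set().union(*replicas)
    for key in keys:
        candidates = [r[key] for r in replicas if key in r]
        newest = max(candidates, key=lambda cell: cell[0])
        for r in replicas:
            r[key] = newest


def simulate(repair_before_gc):
    gc_grace_seconds = 100
    a = {"row1": (1, "data")}
    b = {"row1": (1, "data")}

    # The delete at t=10 reaches replica a only (b is assumed to be down).
    a["row1"] = (10, TOMBSTONE)

    if repair_before_gc:
        repair([a, b])  # b receives the tombstone in time

    # At t=200 the grace period has passed, so tombstones can be purged.
    compact(a, 200, gc_grace_seconds)
    compact(b, 200, gc_grace_seconds)

    repair([a, b])  # a later repair
    return a.get("row1")


print(simulate(repair_before_gc=True))
print(simulate(repair_before_gc=False))
```

In this sketch the first call leaves no trace of the row, while the second ends with the old value copied back onto the replica that had deleted it, which is the "deleted data popping up" effect described in this thread.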

IMHO the need to run repair really needs to be addressed urgently.

I proposed an alternative approach here https://issues.apache.org/jira/browse/CASSANDRA-3620
so vote it up if you share problems!

Dominic

On 14 February 2012 19:54, Todd Burruss <bburruss@expedia.com> wrote:
my design calls for deleting a row (by key, not individual columns) and re-inserting it a
lot and I'm concerned about tombstone build up slowing down reads.  I know if I delete a lot
of individual columns the tombstones will build up and slow down reads until they are cleaned
up, but not sure if the same holds for deleting the whole row.

thoughts?


