cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: data deleted came back after 9 days.
Date Wed, 18 Aug 2010 13:17:55 GMT
Best practice is to schedule repair more often than GCGraceSeconds,
say weekly, rather than running it manually when you notice the
failure detector (FD) marking a node dead.
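
To illustrate why the repair interval matters, here is a toy model in Python. It is a sketch under loud assumptions, not Cassandra code: the names Cell, Replica, repair and simulate are hypothetical, time is counted in whole days, compaction is assumed to drop a tombstone as soon as it is older than GCGraceSeconds, and the 10-day default is the one quoted from the Operations wiki later in this thread. One replica misses a delete on day 1; the deleted value stays deleted only if a repair spreads the tombstone before the other replicas purge it.

```python
from dataclasses import dataclass
from typing import List, Optional

GC_GRACE_DAYS = 10  # default quoted from the Operations wiki in this thread


@dataclass
class Cell:
    value: Optional[str]  # None stands for a tombstone
    written_at: int       # day on which the write or delete happened


class Replica:
    """Hypothetical stand-in for one node holding a single column."""

    def __init__(self) -> None:
        self.cell: Optional[Cell] = None

    def compact(self, today: int) -> None:
        # Assumption: a tombstone older than the grace period is purged.
        cell = self.cell
        if cell and cell.value is None and today - cell.written_at > GC_GRACE_DAYS:
            self.cell = None


def newest(replicas: List[Replica]) -> Optional[Cell]:
    cells = [r.cell for r in replicas if r.cell is not None]
    return max(cells, key=lambda c: c.written_at) if cells else None


def repair(replicas: List[Replica]) -> None:
    # Simplified repair: every replica ends up with the newest cell.
    winner = newest(replicas)
    if winner is not None:
        for r in replicas:
            r.cell = winner


def simulate(repair_every_days: int, horizon_days: int = 30) -> Optional[str]:
    """Return what a read of all replicas sees at the end (None = deleted)."""
    replicas = [Replica() for _ in range(3)]
    for r in replicas:
        r.cell = Cell("v1", written_at=0)
    # Day 1: the third replica is briefly unreachable and misses the delete.
    for r in replicas[:2]:
        r.cell = Cell(None, written_at=1)
    for day in range(2, horizon_days + 1):
        if repair_every_days and day % repair_every_days == 0:
            repair(replicas)
        for r in replicas:
            r.compact(day)
    winner = newest(replicas)
    return winner.value if winner else None


if __name__ == "__main__":
    for interval in (7, 14, 0):  # 0 means repair is never run
        print(interval, simulate(interval))
```

In this model a weekly repair keeps the data deleted, while a 14-day interval (or no repair at all) lets the stale value come back once the tombstones are gone.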

On Tue, Aug 17, 2010 at 3:11 PM, Ned Wolpert <ned.wolpert@imemories.com> wrote:
> (gurus, please check my logic here... I'm trying to validate my
> understanding of this situation.)
> Isn't the issue that while a server was disconnected, a delete could have
> occurred, and thus the disconnected server never got the 'tombstone'?
> (http://wiki.apache.org/cassandra/DistributedDeletes)  When it comes back,
> only after it receives the delete request will the data be deleted from the
> reconnected server.  I do not think this happens automatically when the
> server rejoins the cluster, but requires the manual repair command.
> From my understanding, if the consistency level is greater than the number
> of servers missing that tombstone, you'll get the correct data. If it's less,
> then you 'could' get the right or wrong answer. So the issue is how often do
> you need to run repair? If you have a ReplicationFactor=3, and you use
> ConsistencyLevel.QUORUM, (2 responses) then you need to run it after one
> server fails just to be sure. If you can handle some tolerance for this, you
> can wait a bit more before running the repair.
> On Tue, Aug 17, 2010 at 12:58 PM, Jeremy Dunck <jdunck@gmail.com> wrote:
>>
>> On Tue, Aug 17, 2010 at 2:49 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> > It doesn't have to be disconnected more than GC grace seconds to cause
>> > what you are seeing, it just has to be disconnected at all (thus
>> > missing delete commands).
>> >
>> > Thus you need to be running repair more often than gcgrace, or
>> > confident that read repair will handle it for you (which clearly is
>> > not the case for you :).  see
>> > http://wiki.apache.org/cassandra/Operations
>>
>> FWIW, the docs there say:
>> "Remember though that if a node is down longer than your configured
>> GCGraceSeconds (default: 10 days), it could have missed remove
>> operations permanently"
>>
>> So that's probably a source of misunderstanding.
>
>
>
> --
> Virtually, Ned Wolpert
>
> "Settle thy studies, Faustus, and begin..."   --Marlowe
>
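
Ned's rule of thumb quoted above (a read is safe when it contacts more replicas than are missing the tombstone) can be checked by brute force. This is only a sketch: the function name is hypothetical, and it assumes the tombstone still exists on the other replicas (it has not yet been purged after GCGraceSeconds) and that the newest timestamp wins when responses are merged.

```python
from itertools import combinations


def read_always_sees_tombstone(replication_factor: int,
                               replicas_read: int,
                               replicas_missing: int) -> bool:
    """True if every possible read contacts at least one replica that
    holds the tombstone, whichever replicas happen to be missing it."""
    nodes = range(replication_factor)
    for missing in combinations(nodes, replicas_missing):
        for contacted in combinations(nodes, replicas_read):
            # Unsafe case: every contacted replica missed the delete.
            if set(contacted) <= set(missing):
                return False
    return True


if __name__ == "__main__":
    # ReplicationFactor=3 with QUORUM reads (2 responses)
    print(read_always_sees_tombstone(3, 2, 1))
    print(read_always_sees_tombstone(3, 2, 2))
```

Under these assumptions, ReplicationFactor=3 with QUORUM reads tolerates one replica missing the delete but not two, which is why repairing after the first failure keeps the margin intact.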



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
