incubator-cassandra-user mailing list archives

From Zhong Li <...@voxeo.com>
Subject Re: data deleted came back after 9 days.
Date Wed, 18 Aug 2010 02:49:04 GMT
That data was inserted on one node, then deleted on a remote node less
than 2 seconds later. So it is quite possible that some node missed the
tombstone when the connection was lost.
My question: can a ConsistencyLevel.ALL read bring the lost tombstone
back, instead of running repair?
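
To make the question concrete, here is a toy model of a read at ALL followed by read repair. It is only a sketch: it assumes replicas reconcile by "newest timestamp wins" and that a tombstone is just a cell with no value. The names Cell and read_all_and_repair are made up for illustration and are not Cassandra APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored column: value None marks a tombstone."""
    value: Optional[str]
    timestamp: int


def read_all_and_repair(replicas: List[Dict[str, Cell]], key: str) -> Optional[Cell]:
    """Toy read at ALL: consult every replica, keep the newest cell, and
    write it back to any replica holding something older or nothing."""
    cells = [replica[key] for replica in replicas if key in replica]
    if not cells:
        return None
    newest = max(cells, key=lambda cell: cell.timestamp)
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest
    return newest


insert = Cell("hello", timestamp=100)
tombstone = Cell(None, timestamp=101)

# Case 1: replica c missed the delete; a and b still hold the tombstone.
a, b, c = {"k": tombstone}, {"k": tombstone}, {"k": insert}
result = read_all_and_repair([a, b, c], "k")
# The tombstone wins and is copied to c.

# Case 2: the tombstone has already been purged from a and b.
a2, b2, c2 = {}, {}, {"k": insert}
resurrected = read_all_and_repair([a2, b2, c2], "k")
# Nothing newer than the insert exists, so the old value is copied back.
```

In this model the read only helps for the keys actually read, and only while at least one replica still holds the tombstone. Once it is gone everywhere, the same read spreads the old value instead.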



On Aug 17, 2010, at 4:11 PM, Ned Wolpert wrote:

> (gurus, please check my logic here... I'm trying to validate my  
> understanding of this situation.)
>
> Isn't the issue that while a server was disconnected, a delete could  
> have occurred, and thus the disconnected server never got the  
> 'tombstone'?
> (http://wiki.apache.org/cassandra/DistributedDeletes)  When it comes  
> back, only after it receives the delete request will the data be  
> deleted from the reconnected server.  I do not think this happens  
> automatically when the server rejoins the cluster, but requires the  
> manual repair command.
>
> From my understanding, if the consistency level (the number of
> replicas that must answer the read) is greater than the number of
> servers missing that tombstone, you'll get the correct data. If it's
> less than or equal to that number, then you 'could' get the right or
> wrong answer.
> So the issue is how often you need to run repair. If you have a
> ReplicationFactor=3 and you use ConsistencyLevel.QUORUM (2
> responses), then you need to run it after one server fails just to be
> sure. If you can tolerate some risk here, you can wait a bit
> longer before running the repair.
>
> On Tue, Aug 17, 2010 at 12:58 PM, Jeremy Dunck <jdunck@gmail.com> wrote:
> On Tue, Aug 17, 2010 at 2:49 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> > It doesn't have to be disconnected more than GC grace seconds to cause
> > what you are seeing, it just has to be disconnected at all (thus
> > missing delete commands).
> >
> > Thus you need to be running repair more often than gcgrace, or
> > confident that read repair will handle it for you (which clearly is
> > not the case for you :).  see
> > http://wiki.apache.org/cassandra/Operations
>
> FWIW, the docs there say:
> "Remember though that if a node is down longer than your configured
> GCGraceSeconds (default: 10 days), it could have missed remove
> operations permanently"
>
> So that's probably a source of misunderstanding.
>
>
>
> -- 
> Virtually, Ned Wolpert
>
> "Settle thy studies, Faustus, and begin..."   --Marlowe

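Ned's rule of thumb quoted above (a read is safe when it consults more replicas than are missing the tombstone) can be checked by brute force. This is a sketch under one assumption: a single tombstone among the responses is enough for the read to return the delete. The function name is hypothetical.

```python
from itertools import combinations


def read_always_sees_tombstone(rf: int, replicas_read: int, replicas_missing: int) -> bool:
    """Try every choice of which replicas lack the tombstone and which
    replicas answer the read; fail if some read touches only stale replicas."""
    nodes = range(rf)
    for missing in combinations(nodes, replicas_missing):
        for read_set in combinations(nodes, replicas_read):
            if set(read_set) <= set(missing):
                return False  # every replica consulted lacked the tombstone
    return True


# ReplicationFactor=3 with QUORUM reads (2 responses):
quorum_one_missing = read_always_sees_tombstone(rf=3, replicas_read=2, replicas_missing=1)
quorum_two_missing = read_always_sees_tombstone(rf=3, replicas_read=2, replicas_missing=2)
# Reads at ONE with a single stale replica:
one_one_missing = read_always_sees_tombstone(rf=3, replicas_read=1, replicas_missing=1)
```

With RF=3, QUORUM survives one stale replica but not two, which matches the advice to repair after a single server failure.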

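Jonathan's point about repair frequency comes down to one comparison. The sketch below uses the 10-day default quoted from the Operations page and simplifies by treating a tombstone as purgeable the moment GCGraceSeconds elapses (in practice the purge happens at a later compaction). The function name is hypothetical.

```python
GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 10 days, the default quoted above


def delete_can_resurrect(seconds_until_repair: int,
                         gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """True if the tombstone may be purged before repair reaches the replica
    that missed the delete. How long that replica was down does not matter,
    only how long it goes without being repaired."""
    return seconds_until_repair >= gc_grace_seconds


weekly_repair = delete_can_resurrect(7 * 24 * 60 * 60)
biweekly_repair = delete_can_resurrect(14 * 24 * 60 * 60)
```

Under these assumptions a weekly repair stays inside the grace window and a two-week interval does not, however brief the original disconnect was.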