cassandra-user mailing list archives

From Zhong Li <>
Subject Re: data deleted came back after 9 days.
Date Wed, 18 Aug 2010 02:49:04 GMT
Those data were inserted on one node, then deleted on a remote node
less than 2 seconds later. So it is quite possible that some node
missed the tombstone when the connection was lost.
My question: can a ConsistencyLevel.ALL read recover the lost tombstone,
instead of running repair?
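
To make the question concrete, here is a minimal sketch of last-write-wins reconciliation on a read. It is a toy model, not Cassandra's code: the `Cell` class and the `reconcile`, `read`, and `possible_results` functions are hypothetical names, and the write-back inside `read` is a simplified stand-in for read repair. It enumerates every set of replicas a read could contact at a given consistency level and collects the answers the client could see.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (hypothetical encoding)
    timestamp: int


def reconcile(cells):
    """Last write wins: return the cell with the highest timestamp."""
    return max(cells, key=lambda c: c.timestamp)


def read(replicas, contacted):
    """Read from the contacted replica indexes, then write the winner back
    to them (a simplified stand-in for read repair)."""
    winner = reconcile([replicas[i] for i in contacted])
    for i in contacted:
        replicas[i] = winner
    return winner.value


def possible_results(replicas, consistency_level):
    """Collect every answer a read could return, over all replica subsets
    of size consistency_level. Each read runs on a fresh copy."""
    results = set()
    for contacted in combinations(range(len(replicas)), consistency_level):
        results.add(read(list(replicas), contacted))
    return results


insert = Cell("row data", timestamp=1)
tombstone = Cell(None, timestamp=2)

# ReplicationFactor=3 with one, then two, replicas that missed the delete.
one_stale = [tombstone, tombstone, insert]
two_stale = [tombstone, insert, insert]

print(possible_results(one_stale, 2))  # QUORUM, one replica stale
print(possible_results(two_stale, 2))  # QUORUM, two replicas stale
print(possible_results(two_stale, 3))  # ALL, two replicas stale
```

In this model a read that contacts every replica always sees the tombstone as long as at least one replica still holds it, and the write-back leaves all contacted replicas with the tombstone. A QUORUM read gives that guarantee only while fewer than two replicas are stale.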

On Aug 17, 2010, at 4:11 PM, Ned Wolpert wrote:

> (gurus, please check my logic here... I'm trying to validate my  
> understanding of this situation.)
> Isn't the issue that while a server was disconnected, a delete could  
> have occurred, and thus the disconnected server never got the  
> 'tombstone'?
> When it comes back, only after it receives the delete request will
> the data be deleted from the reconnected server. I do not think this
> happens automatically when the server rejoins the cluster, but
> requires the manual repair command.
> From my understanding, if the consistency level is greater than the
> number of servers missing that tombstone, you'll get the correct
> data. Otherwise, you 'could' get the right or wrong answer.
> So the issue is how often you need to run repair. If you have
> ReplicationFactor=3 and you use ConsistencyLevel.QUORUM (2
> responses), then you need to run it after one server fails just to be
> sure. If you can tolerate some risk here, you can wait a bit
> longer before running the repair.
> On Tue, Aug 17, 2010 at 12:58 PM, Jeremy Dunck <>  
> wrote:
> On Tue, Aug 17, 2010 at 2:49 PM, Jonathan Ellis <>  
> wrote:
> > It doesn't have to be disconnected more than GC grace seconds to cause
> > what you are seeing, it just has to be disconnected at all (thus
> > missing delete commands).
> >
> > Thus you need to be running repair more often than gcgrace, or
> > confident that read repair will handle it for you (which clearly is
> > not the case for you :).  see
> >
> FWIW, the docs there say:
> "Remember though that if a node is down longer than your configured
> GCGraceSeconds (default: 10 days), it could have missed remove
> operations permanently"
> So that's probably a source of misunderstanding.
> -- 
> Virtually, Ned Wolpert
> "Settle thy studies, Faustus, and begin..."   --Marlowe
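
The point quoted above about running repair more often than gcgrace can be sketched with a small simulation. This is an illustrative model under stated assumptions, not Cassandra's implementation: replicas are plain dicts, `compact`, `read_all`, and `read_after_gc_grace` are hypothetical helpers, timestamps are in seconds, and the 10-day value is the default quoted in the thread.

```python
GC_GRACE_SECONDS = 10 * 24 * 3600  # default quoted in the thread: 10 days


def compact(replica, now, gc_grace=GC_GRACE_SECONDS):
    """Drop tombstones (value None) that are at least gc_grace old."""
    return {
        key: (value, ts)
        for key, (value, ts) in replica.items()
        if not (value is None and now - ts >= gc_grace)
    }


def read_all(replicas, key):
    """Last-write-wins read across every replica.
    Returns None when the key is deleted or absent everywhere."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    return max(versions, key=lambda v: v[1])[0]


def read_after_gc_grace(repaired_in_time):
    """Replicas a and b received the delete; c was disconnected and missed it."""
    a = {"k": (None, 2)}
    b = {"k": (None, 2)}
    c = {"k": ("row data", 1)}
    if repaired_in_time:
        c = dict(a)  # assumption: repair copies the tombstone to c
    now = 2 + GC_GRACE_SECONDS
    replicas = [compact(r, now) for r in (a, b, c)]
    return read_all(replicas, "k")


print(read_after_gc_grace(repaired_in_time=True))
print(read_after_gc_grace(repaired_in_time=False))
```

In this model, how long replica c was disconnected does not matter. What matters is whether the tombstone reached c before the other replicas purged it; if it did not, the old value on c is the only version left, so even a read of all replicas returns the deleted data.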
