cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.
Date Tue, 22 Apr 2014 18:02:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977125#comment-13977125 ]

Benedict edited comment on CASSANDRA-6696 at 4/22/14 6:01 PM:
--------------------------------------------------------------

I may be misunderstanding your proposal: I assume you mean assigning the vnodes _to each disk_
via knapsack? In that case your per-disk balance is based solely on the knapsack. If instead the
_cluster wide_ vnode allocation is designed specifically to ensure that any given range maintains
the property I gave (i.e. that any _adjacent_ N vnodes will own within some proportion of the
ideal ownership), then the balance follows from that and will continue to hold no matter how many
nodes are added to the cluster, whereas with the knapsack you will have to re-knapsack each time
the ownership ranges change.
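
To make the property above concrete, here is a minimal sketch (plain Java, not Cassandra code;
the class name, ring size and tolerance are illustrative assumptions) of the invariant a
cluster-wide allocation would have to satisfy: every run of N adjacent vnode ranges on the ring
owns close to N times the ideal per-vnode share, so contiguous groups of ranges can be handed to
disks without re-running a knapsack whenever ownership changes.

{code:java}
import java.util.List;

public class AdjacentOwnershipCheck
{
    /**
     * Returns true if every window of n adjacent vnode ranges owns within
     * [1 - epsilon, 1 + epsilon] times n/v of the ring, where v is the vnode count.
     * Hypothetical helper for illustration only.
     */
    static boolean balanced(List<Long> sortedTokens, long ringSize, int n, double epsilon)
    {
        int v = sortedTokens.size();
        double idealShare = (double) n / v;                    // ideal ownership of n adjacent ranges
        for (int i = 0; i < v; i++)
        {
            long owned = 0;
            for (int j = 0; j < n; j++)
            {
                long start = sortedTokens.get((i + j) % v);
                long end = sortedTokens.get((i + j + 1) % v);
                owned += Math.floorMod(end - start, ringSize); // each range may wrap around the ring
            }
            double share = (double) owned / ringSize;
            if (share < idealShare * (1 - epsilon) || share > idealShare * (1 + epsilon))
                return false;                                  // some group of adjacent ranges is off balance
        }
        return true;
    }
}
{code}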


was (Author: benedict):
I may be misunderstanding your proposal: I assume you mean assign the vnodes _to each disk_
via knapsack? In which case your balance per disk is based solely on the knapsack. If the
_cluster wide_ vnode allocation is designed specifically to ensure that any given range will
maintain the property I gave (i.e. that any N will be within some proportion of the ideal
ownership proportion) then the balance is based on that and will continue to be true no matter
how many nodes are added to the cluster, whereas you will have to re-knapsack each time the
ownership range changes.

> Drive replacement in JBOD can cause data to reappear. 
> ------------------------------------------------------
>
>                 Key: CASSANDRA-6696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: sankalp kohli
>            Assignee: Marcus Eriksson
>             Fix For: 3.0
>
>
> In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one
> and repair is run.
> This can cause deleted data to come back in some cases. The same is true for corrupt
> sstables, where we delete the corrupt sstable and run repair.
> Here is an example:
> Say we have 3 nodes A, B and C, with RF=3 and GC grace=10 days.
> row=sankalp col=sankalp was written 20 days ago and successfully went to all three nodes.
> Then a delete/tombstone was written successfully for the same row and column 15 days ago.
> Since this tombstone is older than gc grace, it was purged on nodes A and B when it was
> compacted together with the actual data, so there is no trace of this row and column on
> nodes A and B.
> Now on node C, say the original data is on drive1 and the tombstone is on drive2. Compaction
> has not yet reclaimed the data and tombstone.
> Drive2 becomes corrupt and is replaced with a new empty drive.
> Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp has come
> back to life.
> Now, after replacing the drive, we run repair. This data will be propagated to all nodes.
> Note: This is still a problem even if we run repair every gc grace.
> 
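
As a toy illustration of the scenario quoted above (this is not Cassandra's storage or repair
code; the replica maps, day-based timestamps and newest-write-wins merge are simplifying
assumptions), the sketch below shows why losing only the drive holding the tombstone lets repair
propagate the deleted value back to every replica:

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TombstoneResurrectionSketch
{
    // A cell is either live data or a tombstone, carrying the day it was written.
    record Cell(long writtenDay, boolean tombstone) {}

    public static void main(String[] args)
    {
        long today = 20, gcGraceDays = 10;

        // Nodes A and B: the tombstone (day 5, i.e. 15 days ago) shadows the original
        // data (day 0), so each holds just the tombstone for row=sankalp col=sankalp.
        Map<String, Cell> nodeA = new HashMap<>(Map.of("sankalp", new Cell(5, true)));
        Map<String, Cell> nodeB = new HashMap<>(nodeA);

        // Node C: data and tombstone sit on different drives, not yet compacted together.
        Map<String, Cell> cDrive1 = new HashMap<>(Map.of("sankalp", new Cell(0, false)));
        Map<String, Cell> cDrive2 = new HashMap<>(Map.of("sankalp", new Cell(5, true)));

        // A and B compact: the tombstone is older than gc grace, so it is purged outright.
        nodeA.values().removeIf(cell -> cell.tombstone() && today - cell.writtenDay() > gcGraceDays);
        nodeB.values().removeIf(cell -> cell.tombstone() && today - cell.writtenDay() > gcGraceDays);

        // Drive2 on C is replaced with an empty drive: only the live data survives on C.
        cDrive2.clear();
        Map<String, Cell> nodeC = new HashMap<>(cDrive1);

        // Repair merges replicas, newest write wins. No replica holds the tombstone any
        // more, so the day-0 value wins and is streamed back to A and B: the delete is undone.
        Map<String, Cell> merged = new HashMap<>(nodeC);
        for (Map<String, Cell> replica : List.of(nodeA, nodeB))
            replica.forEach((key, cell) -> merged.merge(key, cell,
                    (left, right) -> left.writtenDay() >= right.writtenDay() ? left : right));

        // Prints: after repair every replica gets: {sankalp=Cell[writtenDay=0, tombstone=false]}
        System.out.println("after repair every replica gets: " + merged);
    }
}
{code}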



--
This message was sent by Atlassian JIRA
(v6.2#6252)
