cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2633) Keys get lost in bootstrap
Date Wed, 11 May 2011 14:45:47 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031748#comment-13031748 ]

Jonathan Ellis commented on CASSANDRA-2633:
-------------------------------------------

So say we have a node A with rows A B C D on it.

We bootstrap a node C.

C requests (A, C] from A.

A will do a GT scan starting with A.  So a cache hit will result in [A, C] being transferred
instead.  That is a bug; I'll see if I can create a unit test that demonstrates it separately.

But I don't see how this affects the C row?


> Keys get lost in bootstrap
> --------------------------
>
>                 Key: CASSANDRA-2633
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2633
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.5
>            Reporter: Richard Low
>            Priority: Critical
>         Attachments: 0.7-2633.txt
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> When bootstrapping a new node, the key at the upper end of the new node's range can get
> lost.  To reproduce:
> * Set up one cassandra node, create a keyspace and column family and perform some inserts
> * Read every row back
> * Bootstrap a second node
> * Read every row back
> You find one row is missing, whose row key is exactly equal to the token the new node
> gets (for OPP - for RP it's the key whose hash is equal to the token).  If you don't do the
> reads after the inserts, the key is not lost.  I tracked the problem down to
> o.a.c.io.sstable.SSTableReader in getPosition.  The problem is that the cached position is
> used if it is there (so only if the reads were performed).  But this is incorrect because
> the cached position is the start of the row, not the end.  This means the end row itself is
> not transferred.  This causes the last key in the range to get lost.
> Although I haven't seen it, this may occur during anti-entropy repairs too.
> The attached patch (against the 0.7 branch) fixes it by not using the cache for Operator.GT.
> I haven't tested with 0.8 but from looking at the code I think the problem is present.
> This might be related to CASSANDRA-1992
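
The effect described above can be illustrated with a toy model. This is only a
sketch, not Cassandra code: the names ROWS, get_position, rows_for_range and the
use_cache_for_gt flag are hypothetical, a "position" is just a list index, and the
sketch assumes that both ends of the requested range (left, right] are resolved
with a GT lookup. It reuses the scenario from the comment (rows A B C D, range
(A, C]) and shows how a cached row-start position could shift both ends of the
transferred section.

```python
from bisect import bisect_left, bisect_right

# Toy "sstable": rows stored in key order. A position is the index at which
# a row starts; len(ROWS) stands for end of file. (Hypothetical model.)
ROWS = ["A", "B", "C", "D"]


def get_position(key, op, cache, use_cache_for_gt):
    """Return the start position of the first row satisfying (op, key).

    op is "EQ" or "GT". cache maps a key to the start of that key's own row,
    which is only the right answer for EQ lookups.
    """
    if key in cache and (op == "EQ" or use_cache_for_gt):
        return cache[key]  # start of the row for `key` itself
    if op == "EQ":
        i = bisect_left(ROWS, key)
        return i if i < len(ROWS) and ROWS[i] == key else None
    # GT: start of the first row whose key is strictly greater than `key`
    return bisect_right(ROWS, key)


def rows_for_range(left, right, cache, use_cache_for_gt):
    """Rows that would be transferred for the range (left, right].

    Assumption of this sketch: both bounds are resolved with a GT lookup.
    """
    start = get_position(left, "GT", cache, use_cache_for_gt)
    end = get_position(right, "GT", cache, use_cache_for_gt)
    return ROWS[start:end]


cold = {}                                   # no reads happened yet
warm = {k: i for i, k in enumerate(ROWS)}   # every row was read once

print(rows_for_range("A", "C", cold, True))   # cache empty
print(rows_for_range("A", "C", warm, True))   # cache consulted for GT
print(rows_for_range("A", "C", warm, False))  # cache skipped for GT
```

In this model, a warm cache consulted for GT moves the start back onto row A and
moves the end back onto the start of row C, so row C falls outside the
transferred section. Skipping the cache for GT lookups, which is the approach the
description attributes to the attached patch, gives the same result as a cold
cache.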

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
