cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8683) Incremental repairs broken with early opening of compaction results
Date Tue, 27 Jan 2015 12:09:34 GMT


Benedict commented on CASSANDRA-8683:

I think this might still not be quite right, but we're on the right track. This also highlights
another bug (CASSANDRA-8691).

I suspect the mistake may be that we are passing .getToken().maxKeyBound() to getPosition:
we could be looking up a value beyond any present in the index, because there are
multiple adjacent keys with the same token and we are looking past all of them.
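
To illustrate the suspicion above, here is a toy model in Java. It is not Cassandra's code: the Bound class, the TreeMap-backed "index" and positionAtOrAfter are all hypothetical stand-ins, and the assumption is that the lookup asks for the first entry at or after the given bound. When the last keys in the index share a token, the max key bound for that token sorts after all of them and the lookup finds nothing.

```java
import java.util.Map;
import java.util.TreeMap;

class MaxKeyBoundSketch {
    /** Toy ordering: by token, then by key. A null key marks the "max bound" for a token. */
    static final class Bound implements Comparable<Bound> {
        final long token;
        final String key;

        Bound(long token, String key) { this.token = token; this.key = key; }

        /** Hypothetical analogue of a max key bound: sorts after every real key with this token. */
        static Bound maxKeyBound(long token) { return new Bound(token, null); }

        @Override public int compareTo(Bound o) {
            int c = Long.compare(token, o.token);
            if (c != 0) return c;
            if (key == null) return o.key == null ? 0 : 1;
            if (o.key == null) return -1;
            return key.compareTo(o.key);
        }
    }

    /** Position of the first entry at or after the bound, or null if the bound is past the last entry. */
    static Long positionAtOrAfter(TreeMap<Bound, Long> index, Bound bound) {
        Map.Entry<Bound, Long> e = index.ceilingEntry(bound);
        return e == null ? null : e.getValue();
    }

    static TreeMap<Bound, Long> sampleIndex() {
        TreeMap<Bound, Long> index = new TreeMap<>();
        index.put(new Bound(10, "a"), 0L);
        index.put(new Bound(20, "b"), 100L);
        index.put(new Bound(20, "c"), 200L); // same token as the previous key, and the last entry
        return index;
    }

    public static void main(String[] args) {
        TreeMap<Bound, Long> index = sampleIndex();
        System.out.println(positionAtOrAfter(index, new Bound(20, "c")));    // the last real key: 200
        System.out.println(positionAtOrAfter(index, Bound.maxKeyBound(20))); // past every key: null
    }
}
```

In this model a caller that dereferences the result without a null check fails exactly when the bound lies past the last entry.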

That said, if we fix CASSANDRA-8691 we will need to do something about the last index position.
There are two possibilities: 1) as suggested a while back, we could simply ignore the last
record for purposes of getPositionsForRanges(), since our paired source file will contain
the row. Or 2) during index construction we could perhaps retain a lookup of the records immediately
following an index boundary, for the past few boundaries only. We could then use this as our
last key instead. I'm not sure which I prefer, as 1) creates some risk it will not be accounted
for in future; 2) creates some unnecessary complexity.
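
One possible reading of option 2), sketched in Java under stated assumptions: while writing the index we remember, for the last few boundaries only, the first key written after each boundary. The class RecentBoundaryKeys, its method names and the use of file offsets as boundary identifiers are all hypothetical and not taken from any patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RecentBoundaryKeys {
    // Insertion-ordered map from boundary position to the first key following it.
    private final LinkedHashMap<Long, String> firstKeyAfterBoundary;

    RecentBoundaryKeys(final int capacity) {
        this.firstKeyAfterBoundary = new LinkedHashMap<Long, String>() {
            @Override protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                // Keep only the most recent `capacity` boundaries.
                return size() > capacity;
            }
        };
    }

    /** Called during (hypothetical) index construction when a key follows a boundary. */
    void record(long boundaryPosition, String firstKeyAfter) {
        firstKeyAfterBoundary.put(boundaryPosition, firstKeyAfter);
    }

    /** The key that followed the given boundary, or null if it is no longer retained. */
    String firstKeyAfter(long boundaryPosition) {
        return firstKeyAfterBoundary.get(boundaryPosition);
    }
}
```

The bounded map keeps the memory cost fixed, at the price of the extra bookkeeping that the comment calls unnecessary complexity.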

> Incremental repairs broken with early opening of compaction results
> -------------------------------------------------------------------
>                 Key: CASSANDRA-8683
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 2.1.3
>         Attachments: 0001-avoid-NPE-in-getPositionsForRanges.patch
> Incremental repair holds a set of the sstables it started the repair on (we need to
know which sstables were actually validated to be able to anticompact them). This includes
any tmplink files that existed when the compaction started (if we didn't include those,
we would miss data, since we move the start point of the existing non-tmplink files).
> With CASSANDRA-6916 we swap out those instances with new ones (SSTR.cloneWithNewStart
/ SSTW.openEarly), meaning that the underlying file can get deleted even though we hold a
> This causes the unit test error:
> (note that it only fails on trunk though, in 2.1 we don't hold references to the repairing
files for non-incremental repairs, but the bug should exist in 2.1 as well)
