cassandra-commits mailing list archives

From "Jun Rao (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-286) slice offset breaks read repair
Date Thu, 09 Jul 2009 15:32:14 GMT


Jun Rao commented on CASSANDRA-286:

This is a problem with any API that relies on an offset instead of a value. All columns before the
offset affect the outcome. So if any incorrect column before the offset (whether from a missing
delete or a missing insert) doesn't get fixed immediately, the outcome will be incorrect.

One potential fix is to include all columns before the offset in the repair logic, but not in
the thrift return. This won't affect performance much since we already have to scan those columns.
It may complicate the overall logic a bit, though.
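
A minimal sketch of that idea, not the actual Cassandra code: one scan fills two lists, one with every scanned column (tombstones and pre-offset columns included) for the repair comparison, and one with only the live post-offset columns for the client. The names Col, SliceResult, forRepair, forClient and SliceRepairSketch are hypothetical stand-ins, the finish check from the quoted loop is left out for brevity, and returning only live columns to the client is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a column; this is not Cassandra's IColumn.
final class Col {
    final String name;
    final boolean deleted; // true when the column is a tombstone

    Col(String name, boolean deleted) {
        this.name = name;
        this.deleted = deleted;
    }
}

// Two views of the same scan (hypothetical type).
final class SliceResult {
    // Every column scanned, including tombstones and columns before the offset.
    final List<Col> forRepair = new ArrayList<>();
    // Only live columns past the offset; this is what the client would see.
    final List<Col> forClient = new ArrayList<>();
}

final class SliceRepairSketch {
    static SliceResult slice(List<Col> reducedColumns, int offset, int count) {
        SliceResult result = new SliceResult();
        int liveColumns = 0;
        int limit = offset + count;
        for (Col column : reducedColumns) {
            if (liveColumns >= limit)
                break;
            // Repair sees everything the scan touched, so a stale replica
            // can be corrected even for columns the client never receives.
            result.forRepair.add(column);
            if (column.deleted)
                continue;
            liveColumns++;
            if (liveColumns > offset)
                result.forClient.add(column);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Col> columns = Arrays.asList(
            new Col("a", false),
            new Col("b", true), // tombstone before the first post-offset live column
            new Col("c", false),
            new Col("d", false));
        SliceResult r = slice(columns, 1, 1);
        System.out.println("repair sees " + r.forRepair.size() + " columns");
        System.out.println("client sees " + r.forClient.get(0).name);
    }
}
```

With offset 1 and count 1, the tombstone "b" is skipped for counting purposes but still reaches the repair list, so a replica that missed the delete could be fixed on this read.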

> slice offset breaks read repair
> -------------------------------
>                 Key: CASSANDRA-286
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
> [code]
>         int liveColumns = 0;
>         int limit = offset + count;
>         for (IColumn column : reducedColumns)
>         {
>             if (liveColumns >= limit)
>                 break;
>             if (!finish.isEmpty()
>                 && ((isAscending && >
>                     || (!isAscending && <
>                 break;
>             if (!column.isMarkedForDelete())
>                 liveColumns++;
>             if (liveColumns > offset)
>                 returnCF.addColumn(column);
>         }
> [code]
> The problem is that for offset to return the correct "live" columns, it has to ignore
> tombstones it scans before the first live one post-offset.
> This means that instead of being corrected within a few ms of a read, a node can continue
> returning deleted data indefinitely (until the next anti-entropy pass).
> Coupled with offset's inherent inefficiency (see CASSANDRA-261) I think this means we
> should take it out and leave offset to be computed client-side (which, for datasets under
> which it was reasonable server-side, will still be reasonable).

