cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3362) allow sub-row repair
Date Wed, 01 Aug 2012 22:21:04 GMT


Jonathan Ellis commented on CASSANDRA-3362:

Notes from chat:

When we repair [0, 1000], we agree on some level for the merkle tree, say 2, and we say the
merkle tree leaves will be [0, 250], [250, 500], [500, 750], [750, 1000].
Then each node calculates the hash for those leaves based on its keys, and we compare.
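
To make the leaf comparison concrete, here is a minimal sketch in Python. It is not Cassandra's implementation: the function names, the use of SHA-256, and the rule that a key sitting on a boundary goes to the higher leaf are all assumptions for illustration, and it only models the leaves, not the interior nodes of a real merkle tree.

```python
import hashlib

def leaf_ranges(start, end, depth):
    """Split [start, end] into 2**depth equal-width leaf ranges."""
    n = 2 ** depth
    width = (end - start) // n
    return [(start + i * width, start + (i + 1) * width) for i in range(n)]

def leaf_hashes(rows, start, end, depth):
    """Hash each node-local row into the leaf its key falls in.

    rows: dict mapping an integer key to a bytes value (hypothetical layout).
    """
    n = 2 ** depth
    hashers = [hashlib.sha256() for _ in range(n)]
    for key in sorted(rows):
        if not start <= key <= end:
            continue
        # Assumption: a key on a boundary belongs to the higher leaf,
        # except `end`, which stays in the last leaf.
        idx = min((key - start) * n // (end - start), n - 1)
        hashers[idx].update(repr((key, rows[key])).encode())
    return [h.hexdigest() for h in hashers]

def differing_leaves(rows_a, rows_b, start, end, depth):
    """Return the leaf ranges whose hashes differ between two nodes."""
    ranges = leaf_ranges(start, end, depth)
    hashes_a = leaf_hashes(rows_a, start, end, depth)
    hashes_b = leaf_hashes(rows_b, start, end, depth)
    return [r for r, a, b in zip(ranges, hashes_a, hashes_b) if a != b]

node_a = {10: b"x", 300: b"y", 800: b"z"}
node_b = {10: b"x", 300: b"stale", 800: b"z"}
out_of_sync = differing_leaves(node_a, node_b, 0, 1000, 2)
```

With these inputs only the leaf covering key 300 differs, so the whole [250, 500] range would be treated as out of sync, however large its rows are.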

We could make it a two-step process, where everyone starts with the power-of-2 tree, but then
A can say "I have row 10 with a billion columns, let's subdivide [0, 250] into [0, (10, 500000000)]
and [(10, 500000000), 250]."
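
A rough sketch of that subdivision step, in Python. The representation is an assumption made for illustration: a position is a tuple, `(row,)` for a plain row boundary and `(row, column)` for a point inside a row, so ordinary tuple comparison orders them. `propose_splits` and `subdivide` are hypothetical helpers, not existing Cassandra functions.

```python
def propose_splits(column_counts, threshold):
    """Propose a mid-row split point for every row wider than `threshold`.

    column_counts: dict mapping a row key to its number of columns.
    """
    return [
        (row, count // 2)
        for row, count in sorted(column_counts.items())
        if count > threshold
    ]

def subdivide(leaves, split):
    """Return a new leaf list with the leaf containing `split` cut in two."""
    result = []
    for lo, hi in leaves:
        if lo < split < hi:
            result.extend([(lo, split), (split, hi)])
        else:
            result.append((lo, hi))
    return result

# The power-of-2 leaves everyone starts with, as 1-tuples (row boundaries).
leaves = [((0,), (250,)), ((250,), (500,)), ((500,), (750,)), ((750,), (1000,))]

# Node A knows row 10 has a billion columns; row 300 is small.
splits = propose_splits({10: 1_000_000_000, 300: 5}, threshold=1_000_000)
for split in splits:
    leaves = subdivide(leaves, split)
```

After this, the first leaf has been replaced by two leaves that meet at `(10, 500000000)`, and the other three leaves are unchanged.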

The drawback then is that you will do a first validation pass to agree on the subdivisions,
then another to compute the actual hashes.

Or, we could first do a merkle tree as we do now, then for the ranges that differ, if we know
they cover lots of columns (which can be computed easily initially), we could compute smaller
hash ranges before streaming anything.  You'd still read everything twice in the worst case,
but if most rows are small then you don't need to read much the second time.
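
The second pass for one differing range could look roughly like the Python sketch below. Everything here is assumed for illustration: the column layout, the pre-agreed `boundaries`, the SHA-256 hashing, and the convention that `None` means "stream the whole range as today".

```python
import bisect
import hashlib

def sub_range_hashes(columns, boundaries):
    """Hash columns into len(boundaries) + 1 buckets split at `boundaries`.

    columns: dict mapping an integer column name to a bytes value.
    boundaries: sorted split points that both nodes agree on.
    """
    hashers = [hashlib.sha256() for _ in range(len(boundaries) + 1)]
    for name in sorted(columns):
        idx = bisect.bisect_right(boundaries, name)
        hashers[idx].update(repr((name, columns[name])).encode())
    return [h.hexdigest() for h in hashers]

def second_pass(cols_a, cols_b, estimated_columns, threshold, boundaries):
    """Decide what to stream for a range whose first-pass hashes differed.

    Returns None when the range is small enough to stream whole, otherwise
    the indices of the sub-ranges whose hashes differ.
    """
    if estimated_columns <= threshold:
        return None
    hashes_a = sub_range_hashes(cols_a, boundaries)
    hashes_b = sub_range_hashes(cols_b, boundaries)
    return [i for i, (a, b) in enumerate(zip(hashes_a, hashes_b)) if a != b]

cols_a = {i: b"v" for i in range(100)}
cols_b = dict(cols_a)
cols_b[42] = b"stale"

wide = second_pass(cols_a, cols_b, 100, 10, [25, 50, 75])
small = second_pass(cols_a, cols_b, 5, 10, [25, 50, 75])
```

Here `wide` names only the sub-range holding column 42, so a quarter of the columns would be streamed instead of all of them, while `small` is `None` because the range is under the threshold and not worth a second read.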

In the meantime, if you can shard your huge rows at the app level instead, that will work better.
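
One possible way to shard at the app level, sketched in Python. The `row_key:shard` key format, the CRC32 bucketing, and both function names are assumptions for illustration only; any stable mapping from column to shard would do.

```python
import zlib

def sharded_row_key(row_key, column_name, num_shards):
    """Pick the physical row key that stores `column_name`."""
    shard = zlib.crc32(column_name.encode("utf-8")) % num_shards
    return f"{row_key}:{shard}"

def all_shard_keys(row_key, num_shards):
    """List every physical row key that makes up one logical row."""
    return [f"{row_key}:{i}" for i in range(num_shards)]

write_key = sharded_row_key("user42", "event-0001", 16)
read_keys = all_shard_keys("user42", 16)
```

The trade-off is that reading the full logical row has to fetch every key in `read_keys`, but each physical row stays small enough that repairing it whole is cheap.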

> allow sub-row repair
> --------------------
>                 Key: CASSANDRA-3362
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>              Labels: repair
> With large rows, it would be nice to not have to send an entire row if a small part is
> out of sync.  Could we use the row index blocks as repair atoms instead of the full row?
