cassandra-commits mailing list archives

From "Nadav Har'El (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10728) Hash used in repair does not include partition key
Date Sun, 22 Nov 2015 09:13:10 GMT


Nadav Har'El commented on CASSANDRA-10728:

Identical values, yes, but not identical keys...

> Hash used in repair does not include partition key
> --------------------------------------------------
>                 Key: CASSANDRA-10728
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Minor
> When the repair code builds the Merkle tree, it appears to use AbstractCompactedRow.update()
> to calculate a partition's hash. This method's documentation states that it calculates a
> "digest with the data bytes of the row (not including row key or row size)". The code itself
> seems to agree with this comment.
> However, I believe that not including the row (actually, partition) key in the hash function
> is a mistake: it means that if two nodes have the same data under different keys, repair
> would not notice the discrepancy. Moreover, if two different keys have their data switched,
> or have the same data, this again would not be noticed by repair. Actually running into this
> problem in a real repair is not very likely, but I can easily imagine seeing it in a
> hypothetical use case where all partitions have exactly the same data and only the partition
> key matters.
> I am sorry if I'm mistaken and the partition key is actually taken into account in the
> Merkle tree, but I tried to find evidence that it is and failed. Glancing over the code,
> it almost seems that it does use the key: Validator.add() calculates rowHash(), which includes
> the digest (without the partition key) *and* the key's token. But then the code calls
> MerkleTree.TreeRange.addHash() on that tuple, and that function conspicuously ignores the
> token and uses only the digest.
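
To illustrate the concern in the description, here is a minimal Python sketch. It is not Cassandra code: the function names, the use of SHA-256, and the XOR combination of per-partition digests are all assumptions chosen for illustration. It compares two hypothetical nodes whose partitions hold the same data with the keys swapped, once with a digest over the data only and once with a digest that also covers the key.

```python
import hashlib

def digest_without_key(partition_key: bytes, data: bytes) -> bytes:
    # Hypothetical stand-in for a digest over the data bytes only;
    # the partition key is deliberately ignored.
    return hashlib.sha256(data).digest()

def digest_with_key(partition_key: bytes, data: bytes) -> bytes:
    # Hypothetical alternative that also feeds the key into the digest.
    h = hashlib.sha256()
    # Length-prefix the key so the key/data boundary is unambiguous.
    h.update(len(partition_key).to_bytes(4, "big"))
    h.update(partition_key)
    h.update(data)
    return h.digest()

def range_hash(partitions: dict, digest_fn) -> bytes:
    # Combine per-partition digests with XOR (an assumption made here
    # to keep the combination simple and order-independent).
    acc = bytes(32)
    for key in sorted(partitions):
        d = digest_fn(key, partitions[key])
        acc = bytes(a ^ b for a, b in zip(acc, d))
    return acc

# Two hypothetical replicas: same data bytes, but swapped between keys.
node_a = {b"k1": b"apple", b"k2": b"banana"}
node_b = {b"k1": b"banana", b"k2": b"apple"}

same_without_key = (
    range_hash(node_a, digest_without_key) == range_hash(node_b, digest_without_key)
)
same_with_key = (
    range_hash(node_a, digest_with_key) == range_hash(node_b, digest_with_key)
)

print("hashes match when key is excluded:", same_without_key)
print("hashes match when key is included:", same_with_key)
```

In this sketch the key-excluding digest reports the two replicas as identical even though they disagree on which key owns which data, while the key-including digest tells them apart.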
