cassandra-commits mailing list archives

From "Richard Low (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2039) LazilyCompactedRow doesn't add CFInfo to digest
Date Wed, 26 Jan 2011 12:37:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986984#action_12986984 ]

Richard Low commented on CASSANDRA-2039:
----------------------------------------

I added assertDigest (see the patch trunk-2038-LazilyCompactedRowTest.txt), but it currently
fails in testManyRows.  The reason is that the Bloom filters have different sizes, because
getEstimatedColumnCount returns a value that is too large in LazilyCompactedRow.  There is
probably no way around this without doing another pass over the data.
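
To make the failure concrete, here is a toy sketch (plain JDK only, not Cassandra's BloomFilter
or SSTableWriter code; the sizing constants are made up) of why an overestimated column count
changes the digest: the filter is sized from the estimate, so the bit positions and therefore
the serialised filter bytes differ even though the column data is identical, and any digest
that includes the filter then disagrees between the two compaction paths.

{code}
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.BitSet;

// Toy stand-in for a Bloom filter whose size is derived from an estimated element count.
public class BloomSizeDigestDemo
{
    static BitSet buildFilter(int estimatedCount, String... columns)
    {
        int bits = Math.max(estimatedCount * 15, 15); // bit-array length depends on the estimate
        BitSet filter = new BitSet(bits);
        for (String name : columns)
        {
            int h = name.hashCode();
            // two cheap "hash functions", purely illustrative
            filter.set(Math.floorMod(h, bits));
            filter.set(Math.floorMod(h * 31 + 17, bits));
        }
        return filter;
    }

    static String digestOf(BitSet filter, String... columns) throws Exception
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(filter.toByteArray());                      // header goes into the digest,
        for (String name : columns)
            md.update(name.getBytes(StandardCharsets.UTF_8)); // followed by the column data
        return new BigInteger(1, md.digest()).toString(16);
    }

    public static void main(String[] args) throws Exception
    {
        String[] columns = { "col1", "col2", "col3" };
        BitSet exact = buildFilter(columns.length, columns);             // PrecompactedRow: exact count
        BitSet overestimated = buildFilter(columns.length * 4, columns); // LazilyCompactedRow: overestimate
        System.out.println(digestOf(exact, columns));
        System.out.println(digestOf(overestimated, columns));            // different hash, same data
    }
}
{code}

Running it prints two different MD5 values for the same three columns, which is essentially
what assertDigest trips over in testManyRows.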

However, it isn't necessary to add the Bloom filter or indeed the index to the digest - they
are determined by the data that comes later.  So could the header be excluded from the digest?
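
Roughly what I have in mind (hypothetical names, not the real AbstractCompactedRow or IColumn
API) is to feed only the key, the row-level deletion info and the columns themselves into the
digest, and to skip the serialised header entirely:

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Sketch only: ToyColumn stands in for IColumn, and update() for whatever
// both PrecompactedRow and LazilyCompactedRow would call.
final class RowDigest
{
    static void update(MessageDigest digest,
                       ByteBuffer key,
                       long markedForDeleteAt,
                       int localDeletionTime,
                       Iterable<ToyColumn> columns)
    {
        digest.update(key.duplicate());
        digest.update(longToBytes(markedForDeleteAt));  // row-level tombstone stays covered
        digest.update(longToBytes(localDeletionTime));
        for (ToyColumn c : columns)
        {
            digest.update(c.name.duplicate());
            digest.update(c.value.duplicate());
            digest.update(longToBytes(c.timestamp));
        }
        // deliberately nothing for the Bloom filter or column index:
        // they are derived from the data above and add no information
    }

    static byte[] longToBytes(long v)
    {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    static final class ToyColumn
    {
        final ByteBuffer name, value;
        final long timestamp;

        ToyColumn(String name, String value, long timestamp)
        {
            this.name = ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8));
            this.value = ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
            this.timestamp = timestamp;
        }
    }
}
{code}

With the header out of the digest, the overestimated Bloom filter size stops mattering for
validation, and both compaction paths produce the same hash for the same row.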

> LazilyCompactedRow doesn't add CFInfo to digest
> -----------------------------------------------
>
>                 Key: CASSANDRA-2039
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2039
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Richard Low
>            Priority: Minor
>             Fix For: 0.7.2
>
>         Attachments: trunk-2038-LazilyCompactedRowTest.txt, trunk-2038.txt
>
>
> LazilyCompactedRow.update doesn't add the CFInfo or columnCount to the digest, so the
> hash value in the Merkle tree does not include this data.  However, PrecompactedRow does
> include this.  Two consequences of this are:
> * Row-level tombstones are not compared when using LazilyCompactedRow, so they could
> remain inconsistent
> * LazilyCompactedRow and PrecompactedRow produce different hashes for the same row, so
> if two nodes have differing in_memory_compaction_limit_in_mb values, rows whose size falls
> between the two limits will have different hashes and so will always be repaired, even
> when they are identical.
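
To make the first consequence above concrete, here is a minimal sketch (hypothetical names,
not the actual LazilyCompactedRow code): a digest that never sees the row-level tombstone
cannot tell a deleted row apart from a live one with the same columns, whereas a digest that
includes it can.

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class TombstoneNotInDigest
{
    static byte[] hashRow(String column, long markedForDeleteAt, boolean includeTombstone)
            throws Exception
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        if (includeTombstone)
            md.update(ByteBuffer.allocate(8).putLong(markedForDeleteAt).array());
        md.update(column.getBytes(StandardCharsets.UTF_8));  // column data, hashed on both paths
        return md.digest();
    }

    public static void main(String[] args) throws Exception
    {
        // replica A has a row tombstone at t=100, replica B has none (Long.MIN_VALUE here)
        System.out.println(Arrays.equals(hashRow("col1", 100L, false),
                                         hashRow("col1", Long.MIN_VALUE, false))); // true: repair sees no difference
        System.out.println(Arrays.equals(hashRow("col1", 100L, true),
                                         hashRow("col1", Long.MIN_VALUE, true)));  // false: tombstone shows up
    }
}
{code}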

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

