cassandra-commits mailing list archives

From "Minh Do (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5263) Allow Merkle tree maximum depth to be configurable
Date Mon, 03 Feb 2014 07:18:10 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889277#comment-13889277
] 

Minh Do commented on CASSANDRA-5263:
------------------------------------

Using some generated SSTable files within a token range, I ran a test that builds the Merkle
tree at depth 20 and then adds the computed hash values for the rows (69M rows added).  Together,
these two steps are equivalent to a validation compaction on a token range, if I am not missing
anything.

1. Tree building uses, on average, 15-18% of total CPU resources, and no I/O
2. SSTable scanning and row hash computation use, on average, 10-12% of total
   CPU resources, with I/O limited by the configurable global compaction rate limiter



Given Jonathan's pointer on using SSTR.estimatedKeysForRanges() to calculate the number of
rows for an SSTable file, and assuming no overlap among SSTable files (worst case), we can
estimate how many data rows are in a given token range.
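As a minimal sketch of that estimate: in the worst case (no key overlap among SSTables), the row count for the range is just the sum of each SSTable's per-range key estimate. The helper name and the input values below are illustrative assumptions; in Cassandra itself the per-SSTable figure would come from SSTableReader.estimatedKeysForRanges().

```python
def estimate_rows_in_range(per_sstable_estimates):
    """Worst-case row count for a token range: with no key overlap
    among SSTables, simply sum each SSTable's estimated keys for the
    range (the input values here are assumed, not real estimates)."""
    return sum(per_sstable_estimates)

# e.g. three SSTables intersecting the range
print(estimate_rows_in_range([40_000_000, 20_000_000, 9_000_000]))
```

With overlap between SSTables this sum over-counts, which is why it is framed as the worst case.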

From what I understand, here is the formula to calculate the Merkle tree's depth (assuming
each data row has a unique hash value):

1. If the number of rows from all SSTables in a given range is approximately equal to the maximum
number of hash entries in that range (subject to the CF's partitioner), then we build the tree
at depth 20 (the densest case)
2. If the number of rows from all SSTables in a given range does not cover the full hash range
(the sparse case), we build a Merkle tree with a depth less than 20. How do we come up with
the right depth?
     depth = 20 * (n rows / max rows)
where n is the total number of rows in all SSTables and max is the maximum number of hash
entries in that token range.
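The proposed scaling could be sketched as follows. The rounding and the clamp to [1, 20] are my assumptions to keep the result a valid depth; they are not part of the formula above.

```python
import math

MAX_DEPTH = 20  # depth used in the densest case, per the comment above

def merkle_depth(n_rows, max_rows, max_depth=MAX_DEPTH):
    """depth = max_depth * (n_rows / max_rows), rounded up and clamped
    to [1, max_depth].  Rounding and clamping are assumptions added
    for this sketch, not part of the original formula."""
    if max_rows <= 0:
        return max_depth
    depth = math.ceil(max_depth * n_rows / max_rows)
    return max(1, min(max_depth, depth))

print(merkle_depth(2**19, 2**20))  # range half covered -> depth 10
```

Note this scales depth linearly with row count; since each extra level doubles the leaf count, a logarithmic scaling would be an alternative worth considering.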

However, since different partitioners give different max values, is there anything we can assume
to simplify this, such as assuming all partitioners yield the same number of hash entries in a
given token range?

> Allow Merkle tree maximum depth to be configurable
> --------------------------------------------------
>
>                 Key: CASSANDRA-5263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Config
>    Affects Versions: 1.1.9
>            Reporter: Ahmed Bashir
>            Assignee: Minh Do
>
> Currently, the maximum depth allowed for Merkle trees is hardcoded as 15.  This value
should be configurable, just like phi_convict_threshold and other properties.
> Given a cluster with nodes responsible for a large number of row keys, Merkle tree comparisons
can result in a large number of unnecessary row keys being streamed.
> Empirical testing indicates that reasonable changes to this depth (18, 20, etc) don't
affect the Merkle tree generation and differencing timings all that much, and they can significantly
reduce the amount of data being streamed during repair. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
