cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2841) Always use even distribution for merkle tree with RandomPartitionner
Date Wed, 29 Jun 2011 19:32:31 GMT


Jonathan Ellis commented on CASSANDRA-2841:


> Always use even distribution for merkle tree with RandomPartitionner
> --------------------------------------------------------------------
>                 Key: CASSANDRA-2841
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Trivial
>              Labels: repair
>             Fix For: 0.7.7, 0.8.2
>         Attachments: 2841.patch
> When creating the initial merkle tree, repair tries to be (too) smart and uses the key
samples to "guide" the tree splitting. While this is a good idea for OPP, where there is a
good chance the data distribution is uneven, you can't beat an even distribution for the RandomPartitioner.
And a quick experiment even shows that the method used is significantly less efficient than
an even distribution for the ranges of the merkle tree (that is, an even distribution gives
a much better distribution of the number of keys per range of the tree).
> Thus let's switch to an even distribution for RandomPartitioner. That three-line change
alone amounts to a significant improvement in repair's precision.
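
For illustration only, here is a rough sketch of what "even distribution" means for the tree ranges. It is not the code from 2841.patch. It assumes tokens are MD5-derived integers in [0, 2**127), and the helper names (token_for, even_split, keys_per_range) are hypothetical. The tree is split at the midpoint of each range, ignoring key samples, and keys are then counted per leaf range.

```python
import hashlib
from bisect import bisect_left

# Assumption: RandomPartitioner-style tokens live in [0, 2**127).
TOKEN_SPACE = 2 ** 127


def token_for(key):
    """Hypothetical stand-in for the partitioner hash: MD5 folded into the token space."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % TOKEN_SPACE


def even_split(left, right, depth):
    """Split (left, right] at midpoints, returning the right bounds of 2**depth ranges."""
    if depth == 0:
        return [right]
    mid = left + (right - left) // 2
    return even_split(left, mid, depth - 1) + even_split(mid, right, depth - 1)


def keys_per_range(boundaries, tokens):
    """Count tokens per range; a token belongs to the first range whose right bound is >= token."""
    counts = [0] * len(boundaries)
    for t in tokens:
        counts[bisect_left(boundaries, t)] += 1
    return counts


# Usage: 16 leaf ranges, 16000 synthetic keys.
boundaries = even_split(0, TOKEN_SPACE, 4)
tokens = [token_for(("key%d" % i).encode()) for i in range(16000)]
counts = keys_per_range(boundaries, tokens)
print(min(counts), max(counts))
```

Because the hash already spreads keys roughly uniformly over the token space, equal-width ranges should end up with similar key counts, which is the property the description relies on.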
