cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
Date Thu, 01 Sep 2011 15:26:10 GMT


Sylvain Lebresne commented on CASSANDRA-2698:

An EstimatedHistogram would be just fine. That, plus for each pair of merkle trees the number
of ranges that differ and the corresponding size of the streamed data, would be a very good
start imho.
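
To make the first part concrete, here is a rough sketch of the kind of histogram this could feed. It is a simplified stand-in written in Python for illustration, not Cassandra's EstimatedHistogram; the class name `BucketedHistogram`, its power-of-two buckets, and the sample row counts are all assumptions.

```python
from bisect import bisect_left


class BucketedHistogram:
    """Hypothetical, simplified histogram with power-of-two bucket bounds."""

    def __init__(self, bucket_count=32):
        # Upper bounds 1, 2, 4, ...; one extra slot catches overflow values.
        self.bounds = [2 ** i for i in range(bucket_count)]
        self.counts = [0] * (bucket_count + 1)

    def add(self, value):
        # bisect_left returns the first bucket whose upper bound is >= value.
        self.counts[bisect_left(self.bounds, value)] += 1

    def total(self):
        return sum(self.counts)

    def nonempty(self):
        # (upper_bound, count) pairs; an upper bound of None marks overflow.
        labels = self.bounds + [None]
        return [(b, c) for b, c in zip(labels, self.counts) if c]


# Made-up row counts, one per merkle tree range, recorded while building the tree.
rows_per_range = BucketedHistogram()
for rows in [0, 1, 3, 4, 900]:
    rows_per_range.add(rows)

print(rows_per_range.nonempty())  # [(1, 2), (4, 2), (1024, 1)]
```

A histogram skewed toward the large buckets would suggest that each range covers many rows, which is the precision loss the ticket wants to detect.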

I think the only thing we need to figure out for this patch is where it makes the most sense
to record that data. What I mean is that the merkle trees are computed on each node participating
in a repair (and thus that is where the EstimatedHistogram can be computed), while the
differences are computed only on the coordinator. But in an ideal world, it would seem more
useful to store that information together (for a given repair) because the two are related.
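
One possible shape for "storing it together" is sketched below, purely as an illustration. It assumes each node sends its per-range row counts to the coordinator alongside its tree, which is only one option and not something decided here. Every name (`TreeStats`, `PairDiff`, `RepairReport`, the session id, the node names, the numbers) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TreeStats:
    """Per-node data, gathered where the merkle tree is built."""
    node: str
    rows_per_range: list  # one row count per merkle tree range


@dataclass
class PairDiff:
    """Per-pair data, gathered on the coordinator when trees are compared."""
    node_a: str
    node_b: str
    differing_ranges: int
    streamed_bytes: int


@dataclass
class RepairReport:
    """Groups both kinds of data under a single repair session."""
    session_id: str
    trees: dict = field(default_factory=dict)
    diffs: list = field(default_factory=list)

    def add_tree(self, stats):
        self.trees[stats.node] = stats

    def add_diff(self, diff):
        self.diffs.append(diff)

    def total_streamed(self):
        return sum(d.streamed_bytes for d in self.diffs)

    def bytes_per_differing_range(self):
        ranges = sum(d.differing_ranges for d in self.diffs)
        # Avoid dividing by zero when every pair of trees matched.
        return self.total_streamed() / ranges if ranges else 0.0


report = RepairReport("repair-1")
report.add_tree(TreeStats("A", [10, 12, 9]))
report.add_tree(TreeStats("B", [10, 11, 9]))
report.add_diff(PairDiff("A", "B", differing_ranges=3, streamed_bytes=3000))
report.add_diff(PairDiff("A", "C", differing_ranges=0, streamed_bytes=0))
```

Keeping both in one record makes it possible to relate streamed bytes per differing range to the rows-per-range distribution for the same repair.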

> Instrument repair to be able to assess it's efficiency (precision)
> ------------------------------------------------------------------
>                 Key: CASSANDRA-2698
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>              Labels: lhf
> Some reports indicate that repair sometimes transfers huge amounts of data. One hypothesis
is that the merkle tree precision may deteriorate too much at some data size. To check this
hypothesis, it would be reasonable to gather statistics during the merkle tree building on
how many rows each merkle tree range accounts for (and the size that this represents). It is
probably an interesting statistic to have anyway.

