cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2698) Instrument repair to be able to assess it's efficiency (precision)
Date Tue, 09 Apr 2013 04:34:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626208#comment-13626208 ]

Yuki Morishita commented on CASSANDRA-2698:
-------------------------------------------

[~benedict]

Hmm, it is not clear to me why you create an EstimatedHistogram sized by the number of
differences. I sometimes see more than 1000 differences on large clusters. You should just
create it with a reasonable bucket count; you don't have to keep every size for every range.
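
To illustrate what I mean by a fixed bucket count, here is a minimal sketch. It is not
Cassandra's EstimatedHistogram and not the code in the patch: the class name
FixedBucketHistogram, the 20% growth factor and the overflow slot are all assumptions made
for the example. The point is only that memory stays constant no matter how many ranges or
differences a repair session reports.

```java
import java.util.Arrays;

/** Hypothetical fixed-size histogram (illustration only, not Cassandra's EstimatedHistogram). */
final class FixedBucketHistogram
{
    private final long[] upperBounds; // inclusive upper bound of each bucket, ascending
    private final long[] counts;      // one extra slot at the end for overflow values

    FixedBucketHistogram(int bucketCount)
    {
        if (bucketCount < 1)
            throw new IllegalArgumentException("bucketCount must be >= 1");
        upperBounds = new long[bucketCount];
        counts = new long[bucketCount + 1];
        long bound = 1;
        for (int i = 0; i < bucketCount; i++)
        {
            upperBounds[i] = bound;
            // assumed growth factor: about 20% per bucket, and always by at least 1
            bound = Math.max(bound + 1, Math.round(bound * 1.2));
        }
    }

    /** Records one observation; values above the last bound go to the overflow slot. */
    void add(long value)
    {
        int idx = Arrays.binarySearch(upperBounds, value);
        if (idx < 0)
            idx = -idx - 1; // insertion point: first bucket whose bound exceeds value
        counts[idx]++;
    }

    int bucketCount()
    {
        return upperBounds.length;
    }

    long count(int bucket)
    {
        return counts[bucket];
    }

    long overflow()
    {
        return counts[counts.length - 1];
    }

    long total()
    {
        long sum = 0;
        for (long c : counts)
            sum += c;
        return sum;
    }
}
```

With something like this, each difference size is passed to add() as it is seen, and the
summary is read from the counts afterwards; the individual sizes are never stored.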

Using logging to output the statistics is fine at this point, but I think we should come up
with another way to make it easy to see all the logs and statistics related to a given
repair session. I don't have a specific idea yet, though. (Maybe another system CF, similar to
Tracing?)

nits:

- fix coding style (especially whitespace) to match other code.
- EstimatedHistogram#testGroupBy is failing.
- the comparator passed to Arrays#sort in EstimatedHistogram#logSummary has the same condition
in both the if and the else if.
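
On the comparator nit, a sketch of the shape I would expect. The RangeStat type and its
fields are hypothetical and only stand in for whatever logSummary actually sorts. When the
if and the else if test the same condition, the second branch can never be taken, so the
comparator never returns a negative value and the sort order is not well defined.
Delegating to Long.compare avoids writing the branches by hand.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical per-range statistic; stands in for whatever logSummary sorts. */
final class RangeStat
{
    final String range;
    final long differences;

    RangeStat(String range, long differences)
    {
        this.range = range;
        this.differences = differences;
    }
}

final class RangeStatOrdering
{
    // Buggy shape, for comparison (second branch is unreachable):
    //   if (a.differences > b.differences) return 1;
    //   else if (a.differences > b.differences) return -1;
    //   return 0;

    /** Orders by ascending number of differences: negative, zero or positive as required. */
    static final Comparator<RangeStat> BY_DIFFERENCES = new Comparator<RangeStat>()
    {
        public int compare(RangeStat a, RangeStat b)
        {
            return Long.compare(a.differences, b.differences);
        }
    };

    /** Returns a sorted copy and leaves the input array untouched. */
    static RangeStat[] sorted(RangeStat[] stats)
    {
        RangeStat[] copy = Arrays.copyOf(stats, stats.length);
        Arrays.sort(copy, BY_DIFFERENCES);
        return copy;
    }
}
```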
                
> Instrument repair to be able to assess it's efficiency (precision)
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-2698
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Benedict
>            Priority: Minor
>              Labels: lhf
>         Attachments: nodetool_repair_and_cfhistogram.tar.gz, patch_2698_v1.txt, patch.diff, patch-rebased.diff
>
>
> Some reports indicate that repair sometimes transfers huge amounts of data. One hypothesis
> is that the merkle tree precision may deteriorate too much at some data size. To check this
> hypothesis, it would be reasonable to gather statistics during the merkle tree building on
> how many rows each merkle tree range accounts for (and the size that this represents). It is
> probably an interesting statistic to have anyway.

