cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
Date Wed, 05 Aug 2015 03:23:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654745#comment-14654745 ]

Stefania commented on CASSANDRA-5220:
-------------------------------------

Quite impressive gain indeed! Thanks for fixing those rebase errors too.

I've merged your branch into mine and rebased. As you've noticed, some test failures are
not related to this patch, so keeping the branch up to date with trunk makes it easier to
compare our CI results with those of trunk ([here|http://cassci.datastax.com/job/trunk_testall]
and [here|http://cassci.datastax.com/job/trunk_dtest]).

I also pushed [another commit|https://github.com/stef1927/cassandra/commit/27615434aec0ce05c2bfa689020b0e00a6409590]
with some very minor changes, mostly nits or comments. There are also a couple of trivial
things to do marked as {{// CR-TODO}}. I prefer not to clutter the discussion with these trivial
matters and to focus instead on the main points, but if something in the changes concerns
you, don't hesitate to raise it.

Here are the main points:

* Do we need to support repair with older replicas? Normally we do support older nodes in
a cluster when changing message formats; that's why we have a version in the serializers.
So unless repair is different, we need to make sure we still send the old message format to
the old nodes, which I'm afraid could be a bit of a pain to implement. cc [~jbellis] to confirm.

* In {{MerkleTrees.deserialize()}}: is it safe to use {{MessagingService.globalPartitioner()}}?
{{MerkleTree}} currently serializes the partitioner name, so I would have thought we need to
do the same. In fact, why send the range on the wire at all? Can we not just take it from
the tree's {{fullRange}}?

* In {{MerkleTrees}}: why do we need a separate list of {{Range<Token>}}? Isn't a sorted
map, such as a tree map, sufficient?

* From what I understand, the token ranges should not overlap, so should we add a couple of
assertions in {{MerkleTrees}} to make sure this is the case? (I'm not sure about this one).

* While reading the code documentation of {{RepairSession}}, I found an old ticket, CASSANDRA-2816.
I believe this proposed implementation should be fine, as we scan multiple ranges at the same
time in the validation compaction, but I did not read the entire discussion on that ticket,
so I thought I'd mention it here.
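
To make the sorted-map and overlap points above concrete, here is a minimal Java sketch. It is not the code from the patch: {{TokenRange}} and {{RangeKeyedTrees}} are hypothetical stand-ins for {{Range<Token>}} and {{MerkleTrees}}, tokens are plain longs, wrap-around ranges are left out, and the map value is a String placeholder instead of a real {{MerkleTree}}. It only illustrates that a tree map keyed by range can provide the ordered list of ranges by itself, and that a non-overlap check only needs to look at the two neighbouring keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for MerkleTrees: one placeholder "tree" per token range.
class RangeKeyedTrees {

    // Hypothetical stand-in for Range<Token>: a (left, right] interval over long tokens.
    static final class TokenRange implements Comparable<TokenRange> {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            if (left >= right)
                throw new IllegalArgumentException("empty or wrap-around ranges are out of scope for this sketch");
            this.left = left;
            this.right = right;
        }

        // (0, 10] and (10, 20] touch but do not intersect.
        boolean intersects(TokenRange other) {
            return left < other.right && other.left < right;
        }

        // Order by left bound, then by right bound.
        @Override
        public int compareTo(TokenRange other) {
            int cmp = Long.compare(left, other.left);
            return cmp != 0 ? cmp : Long.compare(right, other.right);
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    // The sorted map is the only collection: its key set is the ordered list of ranges.
    private final TreeMap<TokenRange, String> trees = new TreeMap<>();

    // Rejects a range that overlaps an existing one. Because the stored ranges are
    // already disjoint, only the nearest key on each side can intersect the new range.
    void put(TokenRange range, String tree) {
        TokenRange before = trees.floorKey(range);
        TokenRange after = trees.higherKey(range);
        if (before != null && before.intersects(range))
            throw new IllegalArgumentException(range + " overlaps " + before);
        if (after != null && after.intersects(range))
            throw new IllegalArgumentException(range + " overlaps " + after);
        trees.put(range, tree);
    }

    // Ranges in token order, derived from the map instead of a separate list.
    List<TokenRange> ranges() {
        return new ArrayList<>(trees.keySet());
    }
}
```

In this sketch the check throws an exception so that it does not depend on the JVM running with {{-ea}}; in {{MerkleTrees}} itself a plain {{assert}} may well be the better fit.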


> Repair improvements when using vnodes
> -------------------------------------
>
>                 Key: CASSANDRA-5220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Brandon Williams
>            Assignee: Marcus Olsson
>              Labels: performance, repair
>         Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than without them.
> This appears at least in part because it's using a session per range and processing them
> sequentially.  This generates a lot of log spam with vnodes, and while being gentler and lighter
> on hard disk deployments, ssd-based deployments would often prefer that repair be as fast
> as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
