cassandra-commits mailing list archives

From "Marcus Olsson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
Date Fri, 24 Apr 2015 15:18:42 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511202#comment-14511202 ]

Marcus Olsson commented on CASSANDRA-5220:
------------------------------------------

Yes, I ran the dtest, and I see these exceptions as well while running it.

The tests I ran before were very basic: three nodes, using the stress tool with the
cqlstress-example.yaml profile (with the replication factor changed to two) and n=1000000.
I then stopped a node, removed the inserted data and all commitlog entries, started it
again, and ran a full repair on that node using `repair -full -- stresscql`.


The main problem seems to be that it runs out of TreeRanges to iterate over while doing the
validation compaction. I have probably made a faulty assumption somewhere, and the first thing
that comes to mind is that the wrapping iterator sorts the ranges in a different order
than the validation compaction reads them. Unfortunately I don't have time
to debug this further until Monday.
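
To make the suspected ordering mismatch concrete, here is a toy sketch in Python. It is not
the patch code and not how Cassandra models tokens: the ring bounds, the `Range` tuple, and
the `contains`, `unwrap` and `assign` helpers are all made up for illustration. It assumes
partitions are read in ascending token order and that ranges are consumed with a
forward-only cursor. Under those assumptions, a wrapping range that sorts last swallows the
lowest tokens and leaves nothing for the rest.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical model: a range is (left, right], and wraps when left >= right.
Range = Tuple[int, int]
MIN_TOKEN, MAX_TOKEN = 0, 100  # toy ring bounds, not Cassandra's


def contains(r: Range, token: int) -> bool:
    left, right = r
    if left < right:
        return left < token <= right
    # Wrapping range: covers (left, MAX_TOKEN] and [MIN_TOKEN, right].
    return token > left or token <= right


def unwrap(r: Range) -> List[Range]:
    """Split a wrapping range into two non-wrapping pieces."""
    left, right = r
    if left < right:
        return [r]
    return [(MIN_TOKEN - 1, right), (left, MAX_TOKEN)]


def assign(ranges: Iterable[Range], tokens: Iterable[int]) -> List[Optional[Range]]:
    """Match ascending tokens to ranges using a forward-only cursor."""
    it = iter(ranges)
    current = next(it, None)
    matched = []
    for token in tokens:
        # Skip ranges that cannot hold this token; never go back.
        while current is not None and not contains(current, token):
            current = next(it, None)
        matched.append(current)  # None means we ran out of ranges
    return matched


ring = [(90, 10), (10, 30), (50, 70)]  # (90, 10] wraps around the ring
tokens = [5, 20, 60, 95]               # read order: ascending by token

# Sorting by left endpoint puts the wrapping range last, so the cursor
# is exhausted right after the first token.
naive = assign(sorted(ring), tokens)

# Unwrapping before sorting makes range order agree with read order.
fixed = assign(sorted(p for r in ring for p in unwrap(r)), tokens)
```

In this toy model, `naive` matches only the first token and yields `None` for the rest, which
resembles running out of ranges mid-validation, while `fixed` matches every token. Whether the
real failure has this cause is still unverified.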

> Repair improvements when using vnodes
> -------------------------------------
>
>                 Key: CASSANDRA-5220
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Brandon Williams
>              Labels: performance, repair
>         Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch,
>                      cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than without them.
> This appears to be at least in part because it's using a session per range and processing
> them sequentially. This generates a lot of log spam with vnodes, and while being gentler and
> lighter on hard disk deployments, SSD-based deployments would often prefer that repair be
> as fast as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
