cassandra-commits mailing list archives

From "Marcus Olsson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11215) Reference leak with parallel repairs on the same table
Date Thu, 25 Feb 2016 13:13:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167181#comment-15167181 ]

Marcus Olsson commented on CASSANDRA-11215:
-------------------------------------------

I've created a dtest for it [here|https://github.com/emolsson/cassandra-dtest/commit/adbf51a8b17f6923b6fbd3d7a511399afd695738].
It fails on 2.2.x and trunk with LEAK DETECTED, and passes on 2.2 with the provided
patch.

I've found one problem, though: trunk doesn't always seem to get the "Cannot start multiple
repairs" error message. This could be because trunk groups the ranges together and runs only
a single repair session, while 2.2 runs one repair session per range. On trunk "LEAK DETECTED"
is logged only once, whereas on 2.2 it's logged multiple times, so the likelihood of getting
the error may be lower on trunk since it only has one repair session. Should we handle this
by marking the test as flaky, or should we increase the size of the data so that validation runs longer?
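
To compare the two branches, it may help to count how often each message shows up in a node's log after the parallel repairs. The following is only a minimal sketch, not part of the dtest: the function name `summarize_repair_log` and the returned keys are hypothetical, and it assumes the log lines are already available as an iterable of strings. The two substrings it looks for are the ones quoted above.

```python
# Hypothetical helper, not taken from cassandra-dtest.
LEAK_MARKER = "LEAK DETECTED"
MULTIPLE_REPAIRS_MARKER = "Cannot start multiple repairs"


def summarize_repair_log(lines):
    """Count leak reports and rejected-repair messages in log lines."""
    leaks = 0
    rejected = 0
    for line in lines:
        if LEAK_MARKER in line:
            leaks += 1
        if MULTIPLE_REPAIRS_MARKER in line:
            rejected += 1
    return {"leaks": leaks, "rejected_repairs": rejected}
```

If the explanation above is right, a run on trunk would tend to give a single leak and sometimes zero rejected repairs, while 2.2 would give several of each.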

> Reference leak with parallel repairs on the same table
> ------------------------------------------------------
>
>                 Key: CASSANDRA-11215
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11215
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Olsson
>            Assignee: Marcus Olsson
>
> When starting multiple repairs on the same table, Cassandra starts logging reference
leaks such as:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-02-23 15:02:05,516 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5213f926) to class org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@605893242:.../testrepair/standard1-dcf311a0da3411e5a5c0c1a39c091431/la-30-big was not released before the reference was garbage collected
> {noformat}
> Reproducible with:
> {noformat}
> ccm create repairtest -v 2.2.5 -n 3
> ccm start
> ccm stress write n=1000000 -schema replication(strategy=SimpleStrategy,factor=3) keyspace=testrepair
> # And then perform two repairs concurrently with:
> ccm node1 nodetool repair testrepair
> {noformat}
> I know that starting multiple repairs in parallel on the same table isn't very wise,
but this shouldn't result in reference leaks.
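
The "two repairs concurrently" step can also be scripted instead of run by hand from two terminals. This is a rough sketch under stated assumptions: it assumes the `ccm` cluster from the steps above is already running and that `ccm` is on the PATH. The names `repair_command` and `run_parallel_repairs` are hypothetical; only the `ccm node1 nodetool repair testrepair` command line comes from the report.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def repair_command(node="node1", keyspace="testrepair"):
    """Build the repair command line used in the reproduction steps."""
    return ["ccm", node, "nodetool", "repair", keyspace]


def run_parallel_repairs(count=2, runner=subprocess.run, **kwargs):
    """Start `count` identical repairs at roughly the same time.

    `runner` is injectable so the function can be exercised without a cluster.
    """
    cmd = repair_command(**kwargs)
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(runner, cmd) for _ in range(count)]
        # Wait for every repair to finish and collect the results in order.
        return [future.result() for future in futures]
```

Calling `run_parallel_repairs()` with the defaults submits two repairs of `testrepair` through `node1`. The threads only make the start times close; they do not guarantee that the repairs overlap.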



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
