cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5244) Compactions don't work while node is bootstrapping
Date Thu, 14 Feb 2013 04:28:12 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-5244:
----------------------------------------

    Reviewer: vijay2win@yahoo.com
    
> Compactions don't work while node is bootstrapping
> --------------------------------------------------
>
>                 Key: CASSANDRA-5244
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5244
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Jouni Hartikainen
>            Assignee: Brandon Williams
>            Priority: Critical
>              Labels: gossip
>             Fix For: 1.2.2
>
>         Attachments: 5244.txt
>
>
> It seems that there is a race condition in StorageService that prevents compactions from
> completing while the node is in a bootstrap state.
> I have been able to reproduce this multiple times by throttling streaming throughput
> to extend the bootstrap time while simultaneously inserting data into the cluster.
> The problem lies in the synchronization of the initServer(int delay) and
> reportSeverity(double incr) methods, as they both try to acquire the instance lock of
> StorageService through the synchronized keyword. As initServer does not return until the
> bootstrap has completed, all calls to reportSeverity block until then. However,
> reportSeverity is called when starting compactions in CompactionInfo, and thus all
> compactions block until the bootstrap completes.
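The blocking described above can be reproduced in isolation. The following is a minimal sketch, not the actual Cassandra code: `Service` is a hypothetical stand-in for StorageService, and the latch `bootstrapDone` is an assumption that simulates a long-running bootstrap. It only illustrates that two synchronized instance methods share one monitor, so the second caller waits until the first returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class BootstrapLockDemo {
    // Hypothetical stand-in for StorageService; only the locking shape is modeled.
    static class Service {
        final CountDownLatch bootstrapStarted = new CountDownLatch(1);
        final CountDownLatch bootstrapDone = new CountDownLatch(1);

        // Holds the instance monitor for the whole (simulated) bootstrap.
        synchronized void initServer() throws InterruptedException {
            bootstrapStarted.countDown();
            bootstrapDone.await();
        }

        // Needs the same instance monitor, so it waits for initServer to return.
        synchronized void reportSeverity(double incr) {
            // no-op in this sketch
        }
    }

    // Returns true if reportSeverity finished within timeoutMillis while
    // the simulated bootstrap was still in progress.
    static boolean reportedDuringBootstrap(long timeoutMillis) {
        Service service = new Service();
        CountDownLatch reported = new CountDownLatch(1);
        Thread bootstrap = new Thread(() -> {
            try {
                service.initServer();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread compaction = new Thread(() -> {
            service.reportSeverity(1.0);
            reported.countDown();
        });
        try {
            bootstrap.start();
            service.bootstrapStarted.await(); // monitor is now held by initServer
            compaction.start();
            boolean done = reported.await(timeoutMillis, TimeUnit.MILLISECONDS);
            service.bootstrapDone.countDown(); // let the bootstrap finish
            bootstrap.join();
            compaction.join();
            return done;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, `reportedDuringBootstrap(200)` returns false: the compaction thread cannot enter `reportSeverity` until the bootstrap thread leaves `initServer`.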

> This might severely degrade the node's performance after bootstrap, as it might have lots
> of compactions pending while simultaneously starting to serve reads.
> I have been able to solve the issue by adding a separate lock for reportSeverity and
> removing its method-level synchronized modifier. This of course is not a valid approach if
> we must assume that any of Gossiper's IEndpointStateChangeSubscribers could potentially end
> up calling back to StorageService's synchronized methods. However, at least at the moment,
> that does not seem to be the case.
> Maybe somebody with more experience with the codebase can come up with a better solution?
> (This might affect DynamicEndpointSnitch as well, as it also calls reportSeverity
> in its setSeverity method.)
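The workaround the reporter describes (a dedicated lock for reportSeverity) might look roughly like the sketch below. This is an illustration under stated assumptions, not the attached 5244.txt patch and not the real StorageService: `Service`, `severityLock`, and the latches are hypothetical names, and the latch again simulates a slow bootstrap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SeparateLockSketch {
    // Hypothetical stand-in for StorageService with a dedicated severity lock.
    static class Service {
        final CountDownLatch bootstrapStarted = new CountDownLatch(1);
        final CountDownLatch bootstrapDone = new CountDownLatch(1);
        private final Object severityLock = new Object(); // assumed name
        private double severity = 0.0;

        // Still holds the instance monitor for the whole (simulated) bootstrap.
        synchronized void initServer() throws InterruptedException {
            bootstrapStarted.countDown();
            bootstrapDone.await();
        }

        // No longer a synchronized method: it guards its state with its own
        // lock, so it does not contend with initServer.
        void reportSeverity(double incr) {
            synchronized (severityLock) {
                severity += incr;
            }
        }
    }

    // Returns true if reportSeverity finished within timeoutMillis while
    // the simulated bootstrap was still in progress.
    static boolean reportedDuringBootstrap(long timeoutMillis) {
        Service service = new Service();
        CountDownLatch reported = new CountDownLatch(1);
        Thread bootstrap = new Thread(() -> {
            try {
                service.initServer();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread compaction = new Thread(() -> {
            service.reportSeverity(1.0);
            reported.countDown();
        });
        try {
            bootstrap.start();
            service.bootstrapStarted.await(); // instance monitor is held
            compaction.start();
            boolean done = reported.await(timeoutMillis, TimeUnit.MILLISECONDS);
            service.bootstrapDone.countDown(); // let the bootstrap finish
            bootstrap.join();
            compaction.join();
            return done;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here `reportedDuringBootstrap(2000)` returns true, because the compaction thread never needs the instance monitor. As the description notes, this only holds as long as nothing reached from reportSeverity calls back into a synchronized method on the same instance.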

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
