cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2253) Gossiper Starvation
Date Sun, 27 Feb 2011 22:45:37 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000056#comment-13000056 ]

Jonathan Ellis commented on CASSANDRA-2253:
-------------------------------------------

Sounds good to me.  Were you going to submit a patch?

> Gossiper Starvation
> -------------------
>
>                 Key: CASSANDRA-2253
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2253
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>         Environment: linux, windows
>            Reporter: Mikael Sitruk
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The Gossiper periodic task is starved when large sstable files need to be deleted.
> SSTableDeletingReference uses the same scheduledTasks pool (from StorageService) as the Gossiper and other periodic tasks, but the Gossiper task should run every second to ensure correct cluster status (liveness of nodes). When large sstable files (several GB) are deleted, the delete operation can take more than 30 seconds, putting the whole cluster into a wrong state in which nodes are marked as down even though they are alive.
> This leads to unnecessary additional load such as hinted handoff, a wrong cluster state, and increased latency.
> One possible solution is to use separate pools for periodic and non-periodic tasks.
> I have implemented this change and it resolves the problem.
> I can provide a patch.
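
As a rough illustration of the separate-pool idea in the description above (this is not the reporter's patch and not Cassandra's actual code), the following Java sketch counts how many periodic "ticks" fire while one slow one-off task is running, first with a shared pool and then with separate pools. The class name SeparatePoolsSketch, the method ticksDuringSlowTask, the single-threaded pools, and the timings are all assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SeparatePoolsSketch {

    /**
     * Runs one slow one-off task while a periodic task is scheduled, and returns
     * how many periodic ticks had fired by the time the slow task finished.
     * With separatePools == false both tasks share one single-threaded pool.
     */
    static int ticksDuringSlowTask(long slowMillis, long periodMillis, boolean separatePools)
            throws InterruptedException {
        // Hypothetical pool for short, latency-sensitive periodic work (stand-in for gossip).
        ScheduledExecutorService periodic = Executors.newSingleThreadScheduledExecutor();
        // Hypothetical pool for one-off, possibly slow work (stand-in for sstable deletion).
        ScheduledExecutorService oneOff =
                separatePools ? Executors.newSingleThreadScheduledExecutor() : periodic;

        AtomicInteger ticks = new AtomicInteger();
        CountDownLatch slowDone = new CountDownLatch(1);
        try {
            // The slow task is submitted first, so in the shared case it occupies the only thread.
            oneOff.execute(() -> {
                try {
                    Thread.sleep(slowMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                slowDone.countDown();
            });
            // The periodic task can only run when its pool has a free thread.
            periodic.scheduleWithFixedDelay(ticks::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);

            slowDone.await();
            return ticks.get();
        } finally {
            periodic.shutdownNow();
            if (oneOff != periodic) {
                oneOff.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shared pool ticks:   " + ticksDuringSlowTask(600, 50, false));
        System.out.println("separate pool ticks: " + ticksDuringSlowTask(600, 50, true));
    }
}
```

In the shared case the periodic task is queued behind the slow task and makes essentially no progress until it completes; with separate pools the periodic task keeps its schedule regardless of how long the one-off work takes. A larger shared pool would also reduce the starvation, but several slow deletions at once could still occupy every thread, which is presumably why dedicated pools are proposed.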

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
