cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12966) Gossip thread slows down when using batch commit log
Date Tue, 29 Nov 2016 18:39:58 GMT


Jason Brown commented on CASSANDRA-12966:

I've chosen to solve this by updating {{SystemKeyspace}} in the following ways:
- remove the {{synchronized}} keyword from the functions that update the {{peers}} table
- on those same functions, execute the update to the {{peers}} table asynchronously, on a mutation
stage thread, instead of blocking the current thread (which is often the Gossip stage thread)
- return a {{Future}} from those functions so that tests can behave properly
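
A minimal sketch of the pattern described in the list above, not the actual {{SystemKeyspace}} change. The class {{PeerInfoStore}}, the in-memory map, and the fixed thread pool are hypothetical stand-ins for the {{peers}} table and the mutation stage. It only shows the shape of the idea: the method is no longer {{synchronized}}, the write is handed to another thread, and the caller gets back a {{Future}}.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names and structure are assumptions,
// not Cassandra's real SystemKeyspace API.
public class PeerInfoStore {
    // Stands in for the mutation stage: a pool separate from the calling (gossip) thread.
    private final ExecutorService mutationStage = Executors.newFixedThreadPool(2);

    // Stands in for the peers table.
    private final Map<String, String> peers = new ConcurrentHashMap<>();

    // Not synchronized: the caller only enqueues the write and returns immediately.
    public Future<?> updatePeerInfo(String endpoint, String value) {
        return mutationStage.submit(() -> writeToPeersTable(endpoint, value));
    }

    // In a real system, this is where the write would wait on the commit log sync.
    private void writeToPeersTable(String endpoint, String value) {
        peers.put(endpoint, value);
    }

    // Read back a stored value, e.g. from a test.
    public String get(String endpoint) {
        return peers.get(endpoint);
    }

    // Stop accepting new work; already submitted writes still run.
    public void shutdown() {
        mutationStage.shutdown();
    }
}
```

With this shape, the calling thread can ignore the returned {{Future}} and keep processing, while a test can wait on it before asserting on the stored value.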


> Gossip thread slows down when using batch commit log
> ----------------------------------------------------
>                 Key: CASSANDRA-12966
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
> When using batch commit log mode, the Gossip thread slows down when updating peers after a node
bounces. This is because we perform a bunch of updates to the peers table via {{SystemKeyspace.updatePeerInfo}},
which is a synchronized method. How long each of those individual updates takes depends
on how busy the system is at the time with respect to write traffic. If the system is largely quiescent,
each update will be relatively quick (just waiting for the fsync). If the system is getting
a lot of writes, and depending on the {{commitlog_sync_batch_window_in_ms}} setting, each of the Gossip
thread's updates can get stuck in the backlog, which causes the Gossip thread to stop processing.
We have observed in large clusters that a rolling restart triggers and exacerbates
this behavior.

This message was sent by Atlassian JIRA
