cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefan Podkowinski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12281) Gossip blocks on startup when another node is bootstrapping
Date Mon, 24 Oct 2016 11:11:58 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15601668#comment-15601668
] 

Stefan Podkowinski commented on CASSANDRA-12281:
------------------------------------------------

Would you be ok with my approach in the [WIP branch|https://github.com/apache/cassandra/compare/cassandra-3.0...spodkowinski:WIP-12281],
[~jkni]? 

I've implemented two significant changes here:
# TokenMetaData.calculatePendingRanges() will no longer acquire a read lock for range calculation.
This should be fine as long as this method is the only writer and synchronizes around the
ranges map. Without the lock, it's now possible that range calculation results will be behind
the most recent gossip state, but I think that's still preferred  to avoiding updating the
gossip state at all by blocked handlers. Eventually the calculation should catch up.
# The ranges calculation will now only be done once per identical replication configuration,
which should improve performance for clusters with a large number of keyspaces. It would be
nice if [~apeckys] or [~dikanggu] could confirm if their affected clusters have a larger number
of keyspaces or not, so we get a better idea if this could be a bigger factor at play.

Let me know your comments and I'm going to create patches for 2.2 upwards and fire the tests
in case we're good.

> Gossip blocks on startup when another node is bootstrapping
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-12281
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12281
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Eric Evans
>            Assignee: Stefan Podkowinski
>         Attachments: restbase1015-a_jstack.txt
>
>
> In our cluster, normal node startup times (after a drain on shutdown) are less than 1
minute.  However, when another node in the cluster is bootstrapping, the same node startup
takes nearly 30 minutes to complete, the apparent result of gossip blocking on pending range
calculations.
> {noformat}
> $ nodetool-a tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0           1840         0              
  0
> ReadStage                         0         0           2350         0              
  0
> RequestResponseStage              0         0             53         0              
  0
> ReadRepairStage                   0         0              1         0              
  0
> CounterMutationStage              0         0              0         0              
  0
> HintedHandoff                     0         0             44         0              
  0
> MiscStage                         0         0              0         0              
  0
> CompactionExecutor                3         3            395         0              
  0
> MemtableReclaimMemory             0         0             30         0              
  0
> PendingRangeCalculator            1         2             29         0              
  0
> GossipStage                       1      5602            164         0              
  0
> MigrationStage                    0         0              0         0              
  0
> MemtablePostFlush                 0         0            111         0              
  0
> ValidationExecutor                0         0              0         0              
  0
> Sampler                           0         0              0         0              
  0
> MemtableFlushWriter               0         0             30         0              
  0
> InternalResponseStage             0         0              0         0              
  0
> AntiEntropyStage                  0         0              0         0              
  0
> CacheCleanupExecutor              0         0              0         0              
  0
> Message type           Dropped
> READ                         0
> RANGE_SLICE                  0
> _TRACE                       0
> MUTATION                     0
> COUNTER_MUTATION             0
> REQUEST_RESPONSE             0
> PAGED_RANGE                  0
> READ_REPAIR                  0
> {noformat}
> A full thread dump is attached, but the relevant bit seems to be here:
> {noformat}
> [ ... ]
> "GossipStage:1" #1801 daemon prio=5 os_prio=0 tid=0x00007fe4cd54b000 nid=0xea9 waiting
on condition [0x00007fddcf883000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00000004c1e922c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> 	at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:174)
> 	at org.apache.cassandra.locator.TokenMetadata.updateNormalTokens(TokenMetadata.java:160)
> 	at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2023)
> 	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1682)
> 	at org.apache.cassandra.gms.Gossiper.doOnChangeNotifications(Gossiper.java:1182)
> 	at org.apache.cassandra.gms.Gossiper.applyNewStates(Gossiper.java:1165)
> 	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1128)
> 	at org.apache.cassandra.gms.GossipDigestAckVerbHandler.doVerb(GossipDigestAckVerbHandler.java:58)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> [ ... ]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message