cassandra-commits mailing list archives

From "Gary Dusbabek (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-778) Gossiper thread deadlock
Date Mon, 08 Feb 2010 18:49:32 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Dusbabek updated CASSANDRA-778:
------------------------------------

    Attachment: 0001-fix-deadlock.patch

> Gossiper thread deadlock
> ------------------------
>
>                 Key: CASSANDRA-778
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-778
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>             Fix For: 0.6
>
>         Attachments: 0001-fix-deadlock.patch
>
>
> Found this while attempting to bootstrap a node with more than a trivial amount of data:
> Found one Java-level deadlock:
> =============================
> "GMFD:1":
>   waiting to lock monitor 0x0000000100861d60 (object 0x00000001066a7ed8, a org.apache.cassandra.service.StorageService),
>   which is held by "main"
> "main":
>   waiting to lock monitor 0x0000000100860710 (object 0x0000000106c7c968, a org.apache.cassandra.gms.Gossiper),
>   which is held by "GMFD:1"
> Java stack information for the threads listed above:
> ===================================================
> "GMFD:1":
> 	at org.apache.cassandra.service.StorageService.getReplicationStrategy(StorageService.java:226)
> 	- waiting to lock <0x00000001066a7ed8> (a org.apache.cassandra.service.StorageService)
> 	at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:634)
> 	at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:502)
> 	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:445)
> 	at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:812)
> 	at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:607)
> 	at org.apache.cassandra.gms.Gossiper.handleNewJoin(Gossiper.java:582)
> 	at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:649)
> 	- locked <0x0000000106c7c968> (a org.apache.cassandra.gms.Gossiper)
> 	at org.apache.cassandra.gms.Gossiper$GossipDigestAck2VerbHandler.doVerb(Gossiper.java:1061)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:637)
> "main":
> 	at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:861)
> 	- waiting to lock <0x0000000106c7c968> (a org.apache.cassandra.gms.Gossiper)
> 	at org.apache.cassandra.service.StorageService.startBootstrap(StorageService.java:347)
> 	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:318)
> 	- locked <0x00000001066a7ed8> (a org.apache.cassandra.service.StorageService)
> 	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
> 	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:174)
> Found 1 deadlock.
> main acquires the StorageService (SS) lock and doesn't release it before attempting to acquire the Gossiper lock.
> Meanwhile, the gossip stage acquires the Gossiper lock and then attempts to acquire the SS lock.
> Solution is to have finer-grained locking on the resource in SS (the map of replication strategies),
> or to move the collection to a different class (DatabaseDescriptor, maybe?).  This was introduced in CASSANDRA-620.
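
The lock-order inversion described above, and the "finer-grained locking" option, can be sketched roughly as follows. This is a minimal illustration only, not the contents of 0001-fix-deadlock.patch: the Service and Gossip classes are hypothetical stand-ins for StorageService and Gossiper, the method bodies are invented, and the latch exists only to force both threads to hold their monitors at the same time. The assumption is that the strategy map is moved into a ConcurrentHashMap so that looking up a strategy no longer needs the service monitor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class FineGrainedLockSketch {

    /** Hypothetical stand-in for StorageService. */
    static class Service {
        // The map is thread-safe on its own, so lookups do not need the Service monitor.
        private final Map<String, String> strategies = new ConcurrentHashMap<>();

        // Deliberately NOT synchronized: this is the finer-grained part.
        // If this method were synchronized, the demo below would deadlock.
        String getReplicationStrategy(String table) {
            return strategies.computeIfAbsent(table, t -> "strategy-for-" + t);
        }

        // Holds the Service monitor while calling into Gossip, like initServer in the trace.
        synchronized void initServer(Gossip gossip, CountDownLatch bothHoldLocks)
                throws InterruptedException {
            bothHoldLocks.countDown();
            bothHoldLocks.await(); // wait until the gossip thread holds its monitor too
            gossip.addLocalApplicationState("bootstrapping");
        }
    }

    /** Hypothetical stand-in for Gossiper. */
    static class Gossip {
        private final Service service;
        private volatile String localState = "";

        Gossip(Service service) {
            this.service = service;
        }

        synchronized void addLocalApplicationState(String state) {
            localState = state;
        }

        // Holds the Gossip monitor while calling into Service, like applyStateLocally in the trace.
        synchronized String applyStateLocally(String table, CountDownLatch bothHoldLocks)
                throws InterruptedException {
            bothHoldLocks.countDown();
            bothHoldLocks.await(); // wait until the main-like thread holds its monitor too
            return service.getReplicationStrategy(table);
        }
    }

    /** Runs both code paths concurrently and reports whether both threads finished. */
    public static boolean completesWithoutDeadlock() throws InterruptedException {
        Service service = new Service();
        Gossip gossip = new Gossip(service);
        CountDownLatch bothHoldLocks = new CountDownLatch(2);

        Thread mainLike = new Thread(() -> {
            try {
                service.initServer(gossip, bothHoldLocks);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "main-like");
        Thread gossipLike = new Thread(() -> {
            try {
                gossip.applyStateLocally("Keyspace1", bothHoldLocks);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "gossip-like");

        // Daemon threads so a regression (a real deadlock) cannot keep the JVM alive.
        mainLike.setDaemon(true);
        gossipLike.setDaemon(true);
        mainLike.start();
        gossipLike.start();
        mainLike.join(5000);
        gossipLike.join(5000);
        return !mainLike.isAlive() && !gossipLike.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed without deadlock: " + completesWithoutDeadlock());
    }
}
```

In this sketch the main-like thread still blocks briefly on the Gossip monitor, but the gossip-like thread never needs the Service monitor, so the wait cycle from the thread dump cannot form. The other option mentioned in the report, moving the collection to a different class, would break the cycle in the same way.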

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

