ignite-issues mailing list archives

From "Andrew Mashenkov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-8295) Possible deadlock on partition eviction.
Date Wed, 18 Apr 2018 11:05:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442297#comment-16442297 ]

Andrew Mashenkov commented on IGNITE-8295:
------------------------------------------

After wrapping partStoreLock in checkpointLock, I got the following stack trace.
It seems we should also truncate the partition file under checkpointLock.

java.lang.AssertionError: FullPageId [pageId=0001005700000003, effectivePageId=0000005700000003, grpId=2141373874]
 at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
 at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:624)
 at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:142)
 at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:301)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:186)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onCheckpointBegin(GridCacheOffheapManager.java:164)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:3155)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2909)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2808)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
 at java.lang.Thread.run(Thread.java:748)

> Possible deadlock on partition eviction.
> ----------------------------------------
>
>                 Key: IGNITE-8295
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8295
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>            Priority: Major
>             Fix For: 2.6
>
>         Attachments: deadlock.stack
>
>
> GridCacheOffheapManager.recreateCacheDataStore() calls updatePartitionCounter() under partStoreLock, which may try to acquire checkpointReadLock.
> recreateCacheDataStore() can be called with checkpointReadLock held (from GridDhtPartitionsExchangeFuture.updatePartitionFullMap)
> or without it (the GridDhtPartitionEvictor thread calls evictPartitionAsync).
> So a checkpoint can cause a deadlock if it starts in between.
> It seems we should acquire checkpointReadLock before partStoreLock.
>   
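The lock-ordering fix proposed in the description can be sketched as follows. This is a minimal illustration, not Ignite code: the class LockOrderSketch, its fields, and its methods are hypothetical stand-ins built on java.util.concurrent locks, and it assumes the checkpointer takes the write side of the checkpoint lock and then needs the partition store lock. Every path takes the checkpoint lock first and partStoreLock second, so no thread can hold partStoreLock while waiting on the checkpoint lock.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the locking discussed in IGNITE-8295; not Ignite's API.
final class LockOrderSketch {
    // Plays the role of the checkpoint lock (read side for updates, write side for the checkpointer).
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();
    // Plays the role of partStoreLock.
    private final ReentrantLock partStoreLock = new ReentrantLock();
    private long partitionCounter;

    // Update path: checkpoint read lock first, then partStoreLock.
    // The read lock is reentrant, so a caller that already holds it can call this safely.
    long recreateStore(long newCounter) {
        checkpointLock.readLock().lock();
        try {
            partStoreLock.lock();
            try {
                partitionCounter = newCounter;
                return partitionCounter;
            } finally {
                partStoreLock.unlock();
            }
        } finally {
            checkpointLock.readLock().unlock();
        }
    }

    // Checkpointer path (assumed): write lock first, then partStoreLock, i.e. the same order.
    long checkpoint() {
        checkpointLock.writeLock().lock();
        try {
            partStoreLock.lock();
            try {
                return partitionCounter;
            } finally {
                partStoreLock.unlock();
            }
        } finally {
            checkpointLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrderSketch s = new LockOrderSketch();
        // Evictor-like thread and checkpointer-like thread run concurrently.
        Thread evictor = new Thread(() -> { for (int i = 0; i < 1000; i++) s.recreateStore(i); });
        Thread checkpointer = new Thread(() -> { for (int i = 0; i < 1000; i++) s.checkpoint(); });
        evictor.start();
        checkpointer.start();
        evictor.join();
        checkpointer.join();
        System.out.println("finished, counter=" + s.checkpoint());
    }
}
```

If recreateStore() instead took partStoreLock first and the checkpoint read lock second, the evictor-like thread could block on the read lock while the checkpointer-like thread, holding the write lock, blocks on partStoreLock.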



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
