ignite-issues mailing list archives

From "Andrey Gura (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-8797) Error during writeCheckpointEntry is not passed to failure handler during checkpoint finish
Date Thu, 05 Jul 2018 10:01:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Gura updated IGNITE-8797:
--------------------------------
    Fix Version/s: 2.7

> Error during writeCheckpointEntry is not passed to failure handler during checkpoint finish
> -------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8797
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8797
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexey Goncharuk
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.7
>
>
> I observed the following failure in Cache 3 suite:
> {code}
> [13:10:55]W:		 [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138910%paged.PageEvictionMultinodeMixedRegionsTest2%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> [13:10:55]W:		 [org.apache.ignite:ignite-core] class org.apache.ignite.internal.processors.cache.persistence.file.PersistentStorageIOException: Failed to write checkpoint entry [ptr=FileWALPointer [idx=0, fileOff=219747, len=1947], cpTs=1528971054548, cpId=d8b42759-ca5e-4613-b091-ed0356b3915d, type=END]
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2757)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.access$8100(GridCacheDatabaseSharedManager.java:178)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3716)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3277)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3053)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at java.lang.Thread.run(Thread.java:748)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] Caused by: java.nio.file.NoSuchFileException: /data/teamcity/work/c182b70f2dfa6507/work/db/node03-c5dcc243-fc3c-4b2f-8002-81e88d8cff7d/cp/1528971054548-d8b42759-ca5e-4613-b091-ed0356b3915d-END.bin.tmp
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at java.nio.file.Files.move(Files.java:1395)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.writeCheckpointEntry(GridCacheDatabaseSharedManager.java:2752)
> [13:10:55]W:		 [org.apache.ignite:ignite-core] 	... 6 more
> [13:10:55]W:		 [org.apache.ignite:ignite-core] [2018-06-14 10:10:55,509][ERROR][db-checkpoint-thread-#138914%paged.PageEvictionMultinodeMixedRegionsTest3%][GridCacheDatabaseSharedManager] Failed to create checkpoint.
> {code}
> I see two issues here:
> 1) Some concurrent process removes the work folder, which causes the exception above.
> 2) The checkpoint exception is not passed to the failure handler. The cause is a catch block in {{Checkpointer}} marked {{// TODO-ignite-db how to handle exception?}}, which does not propagate the exception and leaves the checkpoint future uncompleted.
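
The two issues in the description can be illustrated with a minimal, self-contained Java sketch. It is not Ignite code and not the actual fix: the class CheckpointFailureSketch, the methods finishCheckpoint and removeWorkDir, the beforeMove test hook, and the Consumer-based failureHandler are all hypothetical names chosen for illustration. It assumes the checkpoint entry is written to a temporary file and then moved into place (as the Files.move frame in the stack trace suggests), and shows one way a catch block could both complete the checkpoint future exceptionally and notify a failure handler.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Illustrative only: none of these names are Ignite APIs. */
final class CheckpointFailureSketch {
    /** Writes the entry to a temp file, then moves it to its final name. */
    static void writeCheckpointEntry(Path cpDir, String name, byte[] data, Runnable beforeMove)
        throws IOException {
        Path tmp = cpDir.resolve(name + ".tmp");
        Files.write(tmp, data);
        beforeMove.run(); // Hypothetical hook: stands in for a concurrent process.
        Files.move(tmp, cpDir.resolve(name)); // Fails if the temp file has vanished.
    }

    /** Finishes a checkpoint; an I/O error fails the future AND reaches the handler. */
    static CompletableFuture<Void> finishCheckpoint(Path cpDir, Runnable beforeMove,
        Consumer<Throwable> failureHandler) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        try {
            byte[] data = "end".getBytes(StandardCharsets.UTF_8);
            writeCheckpointEntry(cpDir, "END.bin", data, beforeMove);
            fut.complete(null);
        }
        catch (IOException e) {
            fut.completeExceptionally(e); // Waiters on the future are released.
            failureHandler.accept(e);     // The error is not silently dropped.
        }
        return fut;
    }

    /** Simulates issue 1: the work folder disappears during the checkpoint. */
    static void removeWorkDir(Path cpDir) {
        try {
            Files.delete(cpDir.resolve("END.bin.tmp"));
            Files.delete(cpDir);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the race scenario and returns the error seen by the handler. */
    static Throwable demo() {
        try {
            Path dir = Files.createTempDirectory("cp-sketch");
            AtomicReference<Throwable> seen = new AtomicReference<>();
            CompletableFuture<Void> fut = finishCheckpoint(dir, () -> removeWorkDir(dir), seen::set);
            if (!fut.isCompletedExceptionally())
                throw new IllegalStateException("checkpoint future should have failed");
            return seen.get();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Failure handler received: " + demo());
    }
}
```

In this sketch the future is completed before the handler is called, so threads waiting on the checkpoint are released even if the handler decides to stop the node. Whether that ordering suits the real Checkpointer is a design decision this example does not settle.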



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
