ignite-issues mailing list archives

From "Vyacheslav Koptilin (Jira)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-12124) Stopping the cache does not wait for expiration process, which may be started and may lead to errors
Date Wed, 04 Sep 2019 08:49:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-12124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vyacheslav Koptilin updated IGNITE-12124:
-----------------------------------------
    Description: 
Stopping a cache with a configured TTL may lead to errors. For instance:
{noformat}
java.lang.NullPointerException
	at org.apache.ignite.internal.processors.cache.GridCacheContext.onDeferredDelete(GridCacheContext.java:1702)
	at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:4040)
	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:75)
	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:66)
	at org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:37)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2501)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2427)
	at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:989)
	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:233)
	at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:150)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
	at java.lang.Thread.run(Thread.java:748)
{noformat}
The obvious reason for this {{NullPointerException}} is that unregistering {{GridCacheTtlManager}}
(see {{GridCacheSharedTtlCleanupManager#unregister}}) does not wait for a running expiration to finish
(in this particular case, {{GridCacheContext}} has already been cleaned up).

 

So, unregistering {{GridCacheTtlManager}} on cache stop must wait for expiration
if it is running for the cache being stopped. On the other hand, it does not seem correct to
wait for expiration while holding the {{checkpointReadLock}}; see {{GridCacheProcessor#processCacheStopRequestOnExchangeDone}}:

{code:java}
private void processCacheStopRequestOnExchangeDone(ExchangeActions exchActions) {
    ...
    doInParallel(
        parallelismLvl,
        sharedCtx.kernalContext().getSystemExecutorService(),
        cachesToStop.entrySet(),
        cachesToStopByGrp -> {
            ...
            for (ExchangeActions.CacheActionData action : cachesToStopByGrp.getValue()) {
                ...
                sharedCtx.database().checkpointReadLock();

                try {
                    // Unregistering of GridCacheTtlManager is performed here.
                    prepareCacheStop(action.request().cacheName(), action.request().destroy());
                }
                finally {
                    sharedCtx.database().checkpointReadUnlock();
                }
            }
            ...
        }
    );
}
{code}
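
To make the required ordering concrete, here is a minimal, self-contained sketch of an unregister operation that waits for an in-flight expiration pass before the caller cleans up the context. It is only an illustration under stated assumptions, not the Ignite implementation or the eventual fix: {{TtlUnregisterSketch}}, {{Manager}}, {{expireLock}} and {{demo}} are hypothetical names, and a plain {{Object}} stands in for {{GridCacheContext}}.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative sketch only; all names here are hypothetical, not Ignite classes. */
class TtlUnregisterSketch {
    /** Stand-in for a per-cache TTL manager. */
    static final class Manager {
        final ReentrantLock expireLock = new ReentrantLock(); // held during an expiration pass
        final CountDownLatch started = new CountDownLatch(1); // signals that a pass has begun
        volatile Object ctx = new Object();                   // stand-in for the cache context
        volatile boolean sawNullCtx;

        void expire() throws InterruptedException {
            started.countDown();
            Thread.sleep(100); // simulate a pass that takes a while
            if (ctx == null)
                sawNullCtx = true; // the point where a real pass would hit the NPE
        }
    }

    private final List<Manager> managers = new CopyOnWriteArrayList<>();

    void register(Manager m) {
        managers.add(m);
    }

    /** Removes the manager, then blocks until any in-flight pass for it has finished. */
    void unregister(Manager m) {
        managers.remove(m);
        m.expireLock.lock();   // waits here while expire() is running
        m.expireLock.unlock();
    }

    /** One iteration of the shared cleanup worker. */
    void expireAll() throws InterruptedException {
        for (Manager m : managers) {
            m.expireLock.lock();
            try {
                if (managers.contains(m)) // skip managers unregistered in the meantime
                    m.expire();
            }
            finally {
                m.expireLock.unlock();
            }
        }
    }

    /** Returns true if the expiration pass never observed a cleaned-up context. */
    static boolean demo() {
        TtlUnregisterSketch shared = new TtlUnregisterSketch();
        Manager m = new Manager();
        shared.register(m);

        Thread worker = new Thread(() -> {
            try {
                shared.expireAll();
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            m.started.await();    // the pass is now in flight
            shared.unregister(m); // blocks until the pass completes
            m.ctx = null;         // context cleanup is safe only after that
            worker.join();
        }
        catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return !m.sawNullCtx;
    }

    public static void main(String[] args) {
        System.out.println("expiration saw live context: " + demo());
    }
}
```

In this sketch the wait happens inside {{unregister}}. Given the concern above about blocking under the {{checkpointReadLock}}, an actual fix would presumably need to perform such a wait before the lock is acquired or after it is released; where exactly that belongs in the cache stop path is not something this example settles.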


> Stopping the cache does not wait for expiration process, which may be started and may lead to errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-12124
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12124
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
