ignite-dev mailing list archives

From "radhakrupa (Jira)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-12086) Ignite pod keeps crashing and failed to recover the node
Date Tue, 20 Aug 2019 08:23:00 GMT
radhakrupa created IGNITE-12086:

             Summary: Ignite pod keeps crashing and failed to recover the node 
                 Key: IGNITE-12086
                 URL: https://issues.apache.org/jira/browse/IGNITE-12086
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.7
            Reporter: radhakrupa
         Attachments: hs_err_pid116.log, ignite-config.xml

Ignite has been deployed on Kubernetes with 3 replicas of the server pod. The pods were
up and running fine for 9 days. We have created 180 inventory tables and 204 transactional
tables. The data has been inserted with the PyIgnite client using the cache.put() method.
This is a very slow operation because PyIgnite is very slow: each insert is committed one
at a time, so it is not able to do bulk-style inserts. PyIgnite was inserting into about 20
of the inventory tables simultaneously (20 different threads/processes).
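
For reference, here is a minimal sketch of how the per-key writes described above could be grouped into batches. It assumes the cache object exposes a put_all() method that takes a dict, as pyignite's Cache does. The helper names batched and bulk_put, the batch size, and the connection details in the usage comment are hypothetical and not taken from the actual loader.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Tuple


def batched(pairs: Iterable[Tuple[Any, Any]], size: int) -> Iterator[Dict[Any, Any]]:
    """Yield dicts holding at most `size` key/value pairs taken from `pairs`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(pairs)
    while True:
        chunk = dict(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_put(cache: Any, pairs: Iterable[Tuple[Any, Any]], batch_size: int = 1000) -> int:
    """Write `pairs` with one put_all() call per batch; return the number of calls made."""
    calls = 0
    for chunk in batched(pairs, batch_size):
        cache.put_all(chunk)  # one request per batch instead of one per key
        calls += 1
    return calls


# Hypothetical usage against a running cluster (host, port, and cache name are assumptions):
# from pyignite import Client
# client = Client()
# client.connect("127.0.0.1", 10800)
# cache = client.get_or_create_cache("inventory_001")
# bulk_put(cache, ((i, f"row-{i}") for i in range(100_000)))
```

Whether batching is applicable depends on how the tables were created and populated; the sketch only shows the grouping step, not the reporter's schema.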

The cluster became unstable after 9 days: one of the pods crashed and failed to recover.
Below is the error log:

to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage [reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null, startCaches=[BgwService]]
java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)

node stopped in the middle of checkpoint. Will restore memory state and finish checkpoint on node start.

The error report file and ignite-config.xml have been attached for your information.

The heap memory and RAM configuration on each Ignite server container is as follows:

Heap Memory: 32 GB


Default memory region: 

cpu: 4

Persistent volume:

wal_storage_size: 10GB

persistence_storage_size: 10GB


This message was sent by Atlassian Jira
