ignite-user mailing list archives

From John <fatmanc...@gmail.com>
Subject Ignite Deadlock
Date Sun, 15 May 2016 12:59:42 GMT
Hi.

I have two Ignite instances that use IgniteCache to store some values.
The cache is configured in replicated mode, so both instances hold the
same data.
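
For reference, the setup is roughly equivalent to the sketch below. It is
not the actual configuration: the cache name "exampleCache" and the
String key/value types are placeholders chosen for illustration, and
everything not set here is left at Ignite's defaults.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ReplicatedCacheSketch {
    public static void main(String[] args) {
        // "exampleCache" and the String types are hypothetical placeholders.
        CacheConfiguration<String, String> cacheCfg =
            new CacheConfiguration<>("exampleCache");

        // REPLICATED mode keeps a full copy of the data on every server node.
        cacheCfg.setCacheMode(CacheMode.REPLICATED);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(cacheCfg);

        // Start a node with this configuration. A real server node would stay
        // up; try-with-resources stops it here only to keep the sketch short.
        try (Ignite ignite = Ignition.start(cfg)) {
            IgniteCache<String, String> cache = ignite.cache("exampleCache");
            cache.put("key", "value");
        }
    }
}
```

Both instances would be started with the same cache configuration.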

I run JNI code to get the cache values, and on rare occasions it crashes,
which in turn kills the Ignite instance. An external script restarts the
failed Ignite instance as soon as it crashes.

I expected the surviving instance (ignite1) to quickly bring the restarted
instance (ignite2) up to date, and both to continue working as usual.

This is exactly what happened for a few days. Then, on one occasion,
ignite2 crashed and ignite1 appeared to go into a deadlock. When ignite2
came back up, it failed to recognize ignite1 and failed to replicate from
it. All client connections to the Ignite instances stopped working as well.

I am seeing this error in the log:

Failed to wait for initial partition map exchange. Possible reasons are:
  ^-- Transactions in deadlock.
  ^-- Long running transactions (ignore if this is the case).
  ^-- Unreleased explicit locks.

and also:

Local node has detected failed nodes and started cluster-wide procedure. To
speed up failure detection please see 'Failure Detection' section under
javadoc for 'TcpDiscoverySpi'


I am using Ignite v1.4.
Any suggestions or ideas would be highly appreciated.

Thanks!
