ignite-dev mailing list archives

From Yakov Zhdanov <yzhda...@apache.org>
Subject Communication exception handling
Date Sat, 28 Nov 2015 12:37:54 GMT
Guys,

I see the following code
(org/apache/ignite/internal/processors/cache/distributed/dht/GridDhtTxPrepareFuture.java:1129):

                    try {
                        cctx.io().send(n, req, tx.ioPolicy());
                    }
                    catch (ClusterTopologyCheckedException e) {
                        fut.onNodeLeft(e);
                    }
                    catch (IgniteCheckedException e) {
                        if (!cctx.kernalContext().isStopping())
                            fut.onResult(e);
                    }


This means that if a node has just started its stop procedure, the exception
is dropped, the future is never completed, and cache operations may hang. If
cache.put() is called from a job while the node is stopping gracefully, the
stop process hangs with 100% probability.
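
To make the hang concrete, here is a minimal standalone sketch of the same
catch pattern. It does not use Ignite classes: SwallowedSendFailure, send(),
prepare() and wouldHang() are hypothetical names made up for illustration, and
a plain CompletableFuture stands in for the prepare future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SwallowedSendFailure {
    /** Hypothetical stand-in for a send that fails, e.g. because the node is stopping. */
    static void send() throws Exception {
        throw new Exception("Failed to send message");
    }

    /**
     * Mirrors the catch block quoted above: the failure is passed to the
     * future only when the node is NOT stopping.
     */
    static CompletableFuture<Void> prepare(boolean nodeStopping) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        try {
            send();
        }
        catch (Exception e) {
            if (!nodeStopping)
                fut.completeExceptionally(e);
            // Otherwise the exception is dropped and nothing ever completes fut.
        }

        return fut;
    }

    /** Returns true if a caller waiting on the future is still blocked after 200 ms. */
    static boolean wouldHang(boolean nodeStopping) {
        try {
            prepare(nodeStopping).get(200, TimeUnit.MILLISECONDS);

            return false;
        }
        catch (TimeoutException e) {
            return true; // Future was never completed.
        }
        catch (ExecutionException e) {
            return false; // Future completed with the send failure.
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("hangs while running:  " + wouldHang(false));
        System.out.println("hangs while stopping: " + wouldHang(true));
    }
}
```

In this sketch the waiter gets the failure when the stopping flag is false and
blocks until the timeout when it is true, which is the behavior I am
describing for operations that wait on the future without a timeout.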

This issue does not affect failure detection or node crash cases, since
those are handled by separate logic.

I changed the Communication SPI to use its internal stopping flag instead of
the system-wide one, and this seems to fix the issue with graceful stop.

Semyon, can you please check whether this may cause any other issues of this
kind?

My changes are here: https://github.com/apache/ignite/pull/278

--Yakov
