ignite-user mailing list archives

From Anmol Rattan <anmolrat...@gmail.com>
Subject Re: One failing node stalling the whole cluster
Date Fri, 16 Sep 2016 13:07:04 GMT
That is a known issue, at least in 1.6, and I am not sure a fix for it is even
in 1.7. As for GC pauses, if you actually have any, it is worth considering JVM
tuning and looking at the allocation and promotion rates.
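
To check whether GC pauses are actually occurring before tuning anything, a rough probe like the one below can help. It is only a sketch: the class and method names (GcPauseProbe, collectionTimesMs, deltaMs) are made up for illustration, and it uses nothing but the standard java.lang.management API. Note that getCollectionTime() reports cumulative elapsed collection time, which for concurrent collectors is not the same as stop-the-world pause time, so treat the number as an indication rather than a measurement. GC logs remain the better source for allocation and promotion rates.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

public class GcPauseProbe {
    /** Cumulative collection time in milliseconds per collector, as reported by the JVM. */
    public static Map<String, Long> collectionTimesMs() {
        Map<String, Long> times = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() returns -1 when the collector does not report it; treat that as 0.
            times.put(gc.getName(), Math.max(0L, gc.getCollectionTime()));
        }
        return times;
    }

    /** Total growth in collection time between two snapshots, summed over all collectors. */
    public static long deltaMs(Map<String, Long> before, Map<String, Long> after) {
        long total = 0;
        for (Map.Entry<String, Long> e : after.entrySet()) {
            total += e.getValue() - before.getOrDefault(e.getKey(), 0L);
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> before = collectionTimesMs();
        Thread.sleep(1000); // sampling interval, chosen arbitrarily
        Map<String, Long> after = collectionTimesMs();
        System.out.println("GC time during the last second (ms): " + deltaMs(before, after));
    }
}
```

Running this periodically inside each node (or reading the same beans over JMX) should show whether collection time spikes around the moments the grid stalls.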

In our case, we had to increase the young generation to 8 GB to deal

However, a slow client definitely hangs the whole grid, even when there is no GC.
The result is a chicken-and-egg problem: if you increase the timeout, the grid
hangs for a longer time.

If you reduce the timeout, clients/nodes will leave the grid early and may even
become segmented. Segmentation policy handling via starting the Ignite bean only
works if you start the process with the ignite script. If the process has been
started some other way, for example from a custom script, it is not supported.
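
For reference, the two knobs discussed here can be set in Java roughly as follows. This is a configuration sketch, not our actual setup: the class name GridTimeoutConfig and the 30 second value are assumptions for illustration, and I am assuming the timeout in question is the node-level failure detection timeout. It needs ignite-core on the classpath.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class GridTimeoutConfig {
    public static IgniteConfiguration build() {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Assumed value. A larger timeout tolerates longer pauses but lets the grid
        // hang longer; a smaller one drops slow nodes sooner and risks segmentation.
        cfg.setFailureDetectionTimeout(30_000L);

        // RESTART_JVM is the policy that depends on the process having been
        // launched through the standard ignite.sh / ignite.bat script.
        cfg.setSegmentationPolicy(SegmentationPolicy.RESTART_JVM);

        return cfg;
    }
}
```

The resulting object would be passed to Ignition.start(cfg). For a process launched from a custom script, SegmentationPolicy.STOP may be the safer choice, with the restart left to whatever supervises the process.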

Thanks & Regards
Anmol Rattan
+91 9538901262

On Fri, Sep 16, 2016 at 10:44 AM, yfernando <yohan.fernando@tudor.com> wrote:

> Hi Denis,
> We have been able to reproduce this situation where a node failure freezes
> the entire grid.
> Please find the full thread dumps of the 5 nodes that are locked up.
> The memoryMode of the caches is configured to be OFFHEAP_TIERED
> The cacheMode is PARTITIONED
> The atomicityMode is TRANSACTIONAL
> We have also seen ALL the clients freeze during a FULL GC occurring on ANY
> single node.
> Please let us know if you require any more information.
> grid-tp1-dev-11220-201609141523318.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/n7791/grid-tp1-dev-11220-201609141523318.txt>
> grid-tp1-dev-11223-201609141523318.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/n7791/grid-tp1-dev-11223-201609141523318.txt>
> grid-tp3-dev-11220-201609141523318.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/n7791/grid-tp3-dev-11220-201609141523318.txt>
> grid-tp3-dev-11221-201609141523318.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/n7791/grid-tp3-dev-11221-201609141523318.txt>
> grid-tp4-dev-11220-201609141523318.txt
> <http://apache-ignite-users.70518.x6.nabble.com/file/n7791/grid-tp4-dev-11220-201609141523318.txt>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/One-failing-node-stalling-the-whole-cluster-tp5372p7791.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
