ignite-user mailing list archives

From Yakov Zhdanov <yzhda...@gridgain.com>
Subject Re: Futures not being triggered?
Date Thu, 30 Jul 2015 14:10:50 GMT

1. Take a thread dump (jstack -l PID) from each node.
2. Create an issue in the Ignite Jira and attach the thread dumps to it.
3. Also attach the logs from all nodes and the config files.
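
If jstack cannot be run against a node for some reason, a roughly similar
dump can be produced from inside the JVM with the standard ThreadMXBean API.
This is only a sketch: the class name ThreadDumpSketch is made up for
illustration, it is not part of Ignite, and the output format differs from
what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not an Ignite class: collects a thread dump in-process.
final class ThreadDumpSketch {

    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Ask for lock details only where the JVM supports them.
        ThreadInfo[] infos = bean.dumpAllThreads(
            bean.isObjectMonitorUsageSupported(),
            bean.isSynchronizerUsageSupported());

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The lock this thread is blocked or waiting on, and its owner if known.
                sb.append(" on ").append(info.getLockName());
                if (info.getLockOwnerName() != null) {
                    sb.append(" owned by \"").append(info.getLockOwnerName()).append('"');
                }
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Threads shown as WAITING or BLOCKED inside a job are the first place to look.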

Do you start a task from within a job and wait synchronously for its
completion? If so, this may lead to thread starvation.
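
To illustrate what I mean by starvation, here is a minimal sketch using a
plain java.util.concurrent thread pool. It is an analogy only and does not
use the Ignite API: the pool stands in for a node's limited number of
parallel jobs, and the class and method names are made up. In the first
method every worker submits a nested task to the same pool and blocks on it,
so no thread is left to run the nested tasks. The second method chains the
nested work without blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not Ignite code.
final class StarvationSketch {

    // Each outer task submits an inner task to the SAME pool and waits for it.
    static boolean blockingNestedCompletes(int poolSize, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allRunning = new CountDownLatch(poolSize);
        List<Future<String>> outer = new ArrayList<>();
        try {
            for (int i = 0; i < poolSize; i++) {
                outer.add(pool.submit(() -> {
                    allRunning.countDown();
                    allRunning.await(); // wait until every worker thread is busy
                    Future<String> inner = pool.submit(() -> "inner done");
                    return inner.get(); // blocks a worker; inner never gets a thread
                }));
            }
            return allDone(outer, timeoutMs);
        } finally {
            pool.shutdownNow(); // interrupts the stuck workers
        }
    }

    // Same nesting, but the inner task is chained instead of waited on.
    static boolean asyncNestedCompletes(int poolSize, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<CompletableFuture<String>> outer = new ArrayList<>();
        try {
            for (int i = 0; i < poolSize; i++) {
                outer.add(CompletableFuture
                    .supplyAsync(() -> "outer", pool)
                    .thenCompose(r ->
                        CompletableFuture.supplyAsync(() -> r + " + inner", pool)));
            }
            return allDone(outer, timeoutMs);
        } finally {
            pool.shutdownNow();
        }
    }

    // Returns true if every future completes within the timeout, false otherwise.
    private static boolean allDone(List<? extends Future<String>> futures, long timeoutMs) {
        try {
            for (Future<String> f : futures) {
                f.get(timeoutMs, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking completes: " + blockingNestedCompletes(6, 1000));
        System.out.println("async completes:    " + asyncNestedCompletes(6, 1000));
    }
}
```

In a thread dump, the blocking variant shows every pool thread parked inside
Future.get(), which is the pattern to look for in the dumps from your nodes.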

Yakov Zhdanov, Director R&D
*GridGain Systems*

2015-07-30 17:00 GMT+03:00 Dabbo <darren.edmonds@spgroup.co.uk>:

> Hi,
> I'm having an issue where an Ignite cluster node is completing a job, but
> not triggering the future.  I'm not sure how to detail my problem, so
> apologies if the following doesn't make any sense.
> I have a Spring web server application running a client-only Apache Ignite
> instance that waits for at least 1 worker node to connect before the server
> application submits jobs to the worker node(s).  The node in my example is
> configured to run 6 parallel jobs.
> (FifoQueueCollisionSpi.parallelJobsNumber=6).
> Each job involves rendering an XSL/XML document into PDF and returning the
> result.
> Usually, after about 2,000 - 9,000 jobs have been successfully processed,
> the node just "stops" processing any more jobs, with more jobs on the server
> remaining to process on its internal queue.  The memory looks fine and no
> errors are reported in the logs.  I have placed logging statements on the
> worker node and I can verify that every job submitted to the worker nodes
> is processed successfully and no communication errors are logged.
> On the client-only server side, I have logging on the future listeners to
> state when they are triggered.  Every job fires its future up until the
> problem occurs; the remaining jobs then fail to trigger their futures, and
> therefore my server application waits for the node to finish a job whilst
> the node itself waits for new jobs (empty?).
> Both the server and node(s) run on the same Ubuntu server (virtual).  I
> have looked at garbage collection tuning when observing "network" time-outs
> between the server and the node.
> I've also adjusted (increased) the TcpDiscoverySpi properties (ackTimeout,
> networkTimeout, socketTimeout and heartbeatFrequency) to allow for
> timeouts during possible GC events.
> Truth is, the garbage collection isn't taking that long to complete - 1
> second in the worst case.  There are no errors reported by Ignite or my
> application to state loss of connection with the node.
> At the moment, I believe the issue is that the node/server are losing the
> future event - communication problem?  Would you have any ideas or advice
> on how to diagnose this problem further?
> Many thanks in advance,
> Darren.
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Futures-not-being-triggered-tp773.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
