ignite-dev mailing list archives

From Dmitry Karachentsev <dkarachent...@gridgain.com>
Subject Grid hang on compute
Date Wed, 07 Dec 2016 13:35:41 GMT

I recently ran into questionable behavior that looks like a bug. The scenario is:

1) Start two data nodes with some cache.

2) From one node, asynchronously submit a large number of jobs to the 
other node. The jobs perform cache operations.

3) The grid hangs almost immediately: all threads are sleeping except 
the public ones, which are waiting for responses.

This happens because all cache and job messages are queued in the 
communication layer, and that queue is limited to the default size 
(1024). It looks like the jobs are waiting for cache responses that 
cannot be received because of this limit. This is hard to diagnose and 
inconvenient (as far as I know, our docs place no restrictions on using 
cache operations from compute jobs).
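
To make the suspected mechanism concrete, here is a toy model in plain Java. It is not Ignite code and only assumes what is described above: job messages and cache responses share one bounded queue. The class and method names (QueueLimitSketch, responseFits) are hypothetical, and the single-threaded sequence of steps is a simplification of what the real threads would do.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Toy model only: one bounded queue shared by job messages and cache responses.
class QueueLimitSketch {

    /**
     * Returns true if the cache response for the running job can still be queued.
     * queueLimit must be at least 1 (ArrayBlockingQueue rejects smaller capacities).
     */
    static boolean responseFits(int queueLimit, int jobsToPost) {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(queueLimit);

        // The async sender posts jobs until the queue is full or it runs out of jobs.
        int posted = 0;
        while (posted < jobsToPost && queue.offer("job-" + posted)) {
            posted++;
        }

        // A worker picks up one job, which frees a single slot.
        String running = queue.poll();

        // The sender still has jobs pending, so it takes the freed slot at once.
        if (posted < jobsToPost && queue.offer("job-" + posted)) {
            posted++;
        }

        // The running job now needs its cache response to pass through the same queue.
        return running != null && queue.offer("cache-response:" + running);
    }

    public static void main(String[] args) {
        System.out.println(responseFits(1024, 10_000)); // false: response is locked out
        System.out.println(responseFits(1024, 100));    // true: queue never fills up
    }
}
```

In this model the job holding the worker can never finish once the number of posted jobs exceeds the limit, so nothing drains the queue. That matches the symptom in step 3, but whether the real hang follows exactly this path is still a guess.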

So my question is: should we try to fix this, or is it enough to update 
the documentation with a recommendation to disable the queue limit in 
such cases?
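
For reference, a minimal sketch of what the documentation workaround could look like, assuming the limit in question is the one set through TcpCommunicationSpi.setMessageQueueLimit, where 0 disables the size limit. It needs ignite-core on the classpath; the class name NoQueueLimitNode is made up for the example.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Hypothetical example class: starts a node with the message queue limit disabled.
class NoQueueLimitNode {
    public static void main(String[] args) {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // 0 disables the queue size limit (the default discussed above is 1024).
        commSpi.setMessageQueueLimit(0);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCommunicationSpi(commSpi);

        // Ignite is AutoCloseable, so the node stops when the block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Node started: " + ignite.name());
        }
    }
}
```

The setting would have to be applied on every node. The trade-off is that an unlimited queue removes back-pressure, so a fast sender could presumably grow the queue until memory runs out.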
