ignite-dev mailing list archives

From Denis Magda <dma...@gridgain.com>
Subject Re: Fixed deadlock in GridDhtAtomicCache (Alex G. your review is needed)
Date Mon, 10 Aug 2015 12:16:32 GMT
Andrey Gura,

Could you put the info on the errors you observed with cache read 
operations in the ticket below?


On 8/10/2015 3:13 PM, Denis Magda wrote:
> What do you mean by cleanup on a higher level?
> Are you suggesting that we set all cache context references to null when
> required, letting the garbage collector deallocate the context's internals
> when the time comes?
> In any case, I've created a ticket where we can collect all the useful
> thoughts and ideas that should help an implementer.
> https://issues.apache.org/jira/browse/IGNITE-1221
> -- 
> Denis
> On 8/5/2015 5:48 PM, Yakov Zhdanov wrote:
>> Guys, what about not invalidating cache contexts on stop? Let's clean up
>> on a higher level.
>> --Yakov
>> 2015-08-04 22:48 GMT+03:00 Denis Magda <dmagda@gridgain.com>:
>>> Alex, thanks for the review!
>>> Sure, this is just a local fix.
>>> Recently I detected and fixed several issues in the TCP communication SPI
>>> that were caused by an invalidated cache context. In addition, Andrey
>>> Gura mentioned that he periodically reproduces hangs in cache get
>>> operations that are most likely caused by an invalidated cache context
>>> as well.
>>> It seems it's time to fix the situation with invalidated cache contexts
>>> globally. I'll create a JIRA task with extensive details in several days,
>>> when I return from a short vacation. Then someone from the community
>>> or I will have a chance to get his/her hands dirty with this :)
>>> As for this deadlock, I'll merge those changes in any case because we need
>>> them in the code to avoid other RuntimeExceptions that may happen
>>> for any other reason. The threads that led to the deadlock were
>>> threads from the partitions supply pool or some internal workers pool.
>>> Regards,
>>> Denis
>>> On Aug 4, 2015, at 22:09, Alexey Goncharuk <alexey.goncharuk@gmail.com>
>>> wrote:
>>> The change itself looks right and can be merged; however, I do not think
>>> this is a complete fix. What kind of running threads were using the
>>> invalidated cache context? These threads may raise plenty of other
>>> exceptions if an invalid context is used. I think the proper solution
>>> should block on a guard (I am sure we already have a guard that we can
>>> reuse) and wait for all threads to release this guard before cleaning up
>>> the context.
>>> 2015-08-04 8:28 GMT-07:00 Denis Magda <dmagda@gridgain.com>:
>>> Hi Alex, Igniters,
>>> I've fixed a deadlock in GridDhtAtomicCache that was causing frequent
>>> hangs of the "Cache Restart" test suite.
>>> In short, the deadlock happened because a cache was already stopped but
>>> some running threads performing cache-related operations kept using
>>> the invalidated GridCacheContext.
>>> All the details are described here:
>>> https://issues.apache.org/jira/browse/IGNITE-1189
>>> Alex, as one of the earlier implementers of this code, please review the
>>> changes.
>>> Regards,
>>> Denis
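
The guard-based shutdown suggested in the thread (block on a guard and wait for all threads to release it before cleaning up the context) could look roughly like the sketch below. This is only an illustration under stated assumptions: ContextGuard and its methods enter(), leave() and stop() are hypothetical names, not Ignite classes or APIs, and the thread does not say which existing guard would be reused. The sketch uses the JDK's ReentrantReadWriteLock: cache operations hold the read lock while they use the context, and stop takes the write lock, which blocks until every in-flight operation has released the guard.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical guard for illustration only; not part of Ignite. */
class ContextGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Set once the context has been cleaned up; accessed only under the lock. */
    private boolean stopped;

    /**
     * Called by a thread before it touches the context.
     * Returns false if the context is already stopped; in that case the
     * caller must not use the context and must not call leave().
     */
    boolean enter() {
        lock.readLock().lock();
        if (stopped) {
            lock.readLock().unlock();
            return false;
        }
        return true;
    }

    /** Called by a thread when it is done with the context. */
    void leave() {
        lock.readLock().unlock();
    }

    /**
     * Waits until every thread that entered has left, then runs the cleanup
     * exactly once. New callers of enter() wait while cleanup runs and then
     * see the stopped flag.
     */
    void stop(Runnable cleanup) {
        lock.writeLock().lock();
        try {
            if (stopped)
                return;
            stopped = true;
            cleanup.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With a guard of this shape, a worker would wrap each operation as `if (guard.enter()) { try { /* use context */ } finally { guard.leave(); } }`, and the stop path would pass the context cleanup to `guard.stop(...)`. A thread that calls stop() while it is itself inside enter()/leave() would block forever, so the stop path must run outside the guard.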
