zookeeper-user mailing list archives

From Mahadev Konar <maha...@yahoo-inc.com>
Subject Re: avoiding deadlocks on client handle close w/ python/c api
Date Tue, 04 May 2010 21:41:00 GMT
Sure, I'll take a look at it.


On 5/4/10 2:32 PM, "Patrick Hunt" <phunt@apache.org> wrote:

> Thanks Kapil. Mahadev, perhaps you could take a look at this as well?
> Patrick
> On 05/04/2010 06:36 AM, Kapil Thangavelu wrote:
>> I've constructed a simple example, using just the zkpython library with
>> condition variables, that will deadlock. I've filed a new ticket for it:
>> https://issues.apache.org/jira/browse/ZOOKEEPER-763
>> The gdb stack traces look suspiciously like the ones in ZOOKEEPER-591, but
>> without the watchers.
>> https://issues.apache.org/jira/browse/ZOOKEEPER-591
>> The example attached to the ticket deadlocks in both zk 3.3.0 (which has
>> the fix for 591) and trunk.
>> -kapil
>> On Mon, May 3, 2010 at 9:48 PM, Kapil Thangavelu <kapil.foss@gmail.com> wrote:
>>> Hi Folks,
>>> I'm constructing an async api on top of the zookeeper python bindings for
>>> twisted. The intent is a thin wrapper around the existing async api, one
>>> that allows for integration with the twisted python event loop
>>> (http://www.twistedmatrix.com), primarily using the async apis.
>>> One issue I'm running into while developing unit tests: deadlocks occur
>>> if we attempt to close a handle while there are any outstanding async
>>> requests (aget, acreate, etc). Normally on close both the io thread and
>>> the completion thread are terminated and joined. However, with
>>> outstanding async requests, the completion thread won't be in a
>>> joinable state, and we effectively hang when the main thread does the join.
>>> I'm curious if this would be considered a bug. afaics the ideal behavior
>>> would be, on close of a handle, to effectively clear out any remaining
>>> callbacks and let the completion thread terminate.
>>> I've tried adding some bookkeeping to the api to guard against closing
>>> while there is an outstanding completion request, but it's an imperfect
>>> solution due to the nature of the event loop integration. The problem is that
>>> the python callback invoked by the completion thread in turn schedules a
>>> function for the main thread. In twisted the api for this is implemented by
>>> appending the function to a list attribute on the reactor and then writing a
>>> byte to a pipe to wake up the main thread. If a thread switch to the main
>>> thread occurs before the completion thread callback returns, the scheduled
>>> function runs and the rest of the application keeps processing. For the
>>> unit tests, the last step of that processing is to close the connection,
>>> which results in a deadlock.
>>> I've included some of the client log and gdb stack traces from a
>>> deadlocked client process.
>>> thanks,
>>> Kapil
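
The bookkeeping described in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions, not the wrapper Kapil wrote and not part of zkpython: `PendingGuard`, `wrap`, and `wait_idle` are hypothetical names, and a plain Python thread stands in for the completion thread. The idea is to count a request as finished only after its Python callback has returned, and to have the closing thread wait for the count to reach zero before it would close the handle. It narrows the window described above but says nothing about what the C client does after the callback returns, so it may not remove the deadlock.

```python
import threading


class PendingGuard:
    """Hypothetical helper that counts outstanding async requests."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def wrap(self, callback):
        """Register one request; return a callback that marks it done on return."""
        with self._cond:
            self._pending += 1

        def wrapped(*args):
            try:
                return callback(*args)
            finally:
                # Decrement only after the user callback has fully returned.
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()

        return wrapped

    def wait_idle(self, timeout=None):
        """Block until no requests are outstanding; False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)


# Stand-in for an async request: a plain thread plays the completion thread.
results = []
guard = PendingGuard()
cb = guard.wrap(lambda value: results.append(value))
worker = threading.Thread(target=cb, args=("data",))
worker.start()

# A real wrapper would close the handle only after this returns True.
idle = guard.wait_idle(timeout=5.0)
worker.join()
```

The timeout matters: if a request never completes, an unbounded wait here would simply move the hang from the join into the guard.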
