hadoop-zookeeper-dev mailing list archives

From "Kapil Thangavelu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client
Date Wed, 05 May 2010 19:10:02 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864452#action_12864452
] 

Kapil Thangavelu commented on ZOOKEEPER-763:
--------------------------------------------

Hi Henry

The issue with the example I sent is that when the condition notify happens in the python
callback, the main process thread can start running before the callback has exited, while the
completion thread is still running. The example could probably be made more explicit for
reproducing by inserting a time.sleep(1) line into the callback after the notify.
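
To illustrate the window described above without needing a ZooKeeper server, here is a minimal
sketch that uses only the standard threading module. It is not the attached deadlock.py; the
function name and the plain Python thread standing in for the C client's completion thread are
assumptions made for illustration. It only shows that the main thread resumes while the
callback is still executing.

```python
import threading
import time


def main_resumes_before_callback_exits(callback_delay=0.2):
    """Return True if the main thread woke up while the callback was still running."""
    cond = threading.Condition()
    state = {"notified": False, "callback_done": False}

    def callback():
        # Stand-in for the python completion callback (hypothetical, no zkpython involved).
        with cond:
            state["notified"] = True
            cond.notify()
        # Sleeping after the notify widens the window, as suggested in the comment above.
        time.sleep(callback_delay)
        state["callback_done"] = True

    # Stand-in for the completion thread that the C client would own.
    completion_thread = threading.Thread(target=callback)

    with cond:
        completion_thread.start()
        while not state["notified"]:
            cond.wait()

    # The main thread is running again. In the real client this is the point where the
    # handle would be closed, while the completion thread is still inside the callback.
    still_in_callback = not state["callback_done"]

    completion_thread.join()
    return still_in_callback


if __name__ == "__main__":
    print(main_resumes_before_callback_exits())
```

In this sketch the join succeeds because nothing else blocks the worker; in the reported case the
close is attempted inside that same window, which is where the hang shows up.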

This is the stack trace for the completion thread on deadlock.

#0  0x00cb8422 in __kernel_vsyscall ()
#1  0x00387245 in sem_wait@@GLIBC_2.1 () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x0810abe8 in PyThread_acquire_lock ()
#3  0x080dcc11 in PyEval_EvalFrameEx ()
#4  0x080e2807 in PyEval_EvalCodeEx ()
#5  0x080e0c8b in PyEval_EvalFrameEx ()
#6  0x080e1bb0 in PyEval_EvalFrameEx ()
#7  0x080e2807 in PyEval_EvalCodeEx ()
#8  0x0816b2ac in ?? ()
#9  0x0806245a in PyObject_Call ()
#10 0x080db892 in PyEval_CallObjectWithKeywords ()
#11 0x080624f0 in PyObject_CallObject ()
#12 0x00a8dd95 in data_completion_dispatch (rc=0, value=0xa01dcd8 "8\334\001\n\370\263N",
value_len=0, stat=0xb773828c, data=0xa0035d0) at src/c/zookeeper.c:410
#13 0x00f33ecc in process_completions (zh=0xa024ab0) at src/zookeeper.c:1778
#14 0x00f4005b in do_completion (v=0xa024ab0) at src/mt_adaptor.c:333
#15 0x0038096e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#16 0x00461a0e in clone () from /lib/tls/i686/cmov/libc.so.6

thanks,
Kapil


> Deadlock on close w/ zkpython / c client
> ----------------------------------------
>
>                 Key: ZOOKEEPER-763
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client, contrib-bindings
>    Affects Versions: 3.3.0
>         Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
>            Reporter: Kapil Thangavelu
>            Assignee: Mahadev konar
>             Fix For: 3.4.0
>
>         Attachments: deadlock.py, stack-trace-deadlock.txt
>
>
> Deadlocks occur if we attempt to close a handle while there are any outstanding async
requests (aget, acreate, etc). Normally on close both the io thread and the completion
thread are terminated and joined; however, with outstanding async requests, the completion
thread won't be in a joinable state, and we effectively hang when the main thread does the
join.
> afaics ideal behavior would be on close of a handle, to effectively clear out any remaining
callbacks and let the completion thread terminate.
> I've tried adding some bookkeeping within a python client to guard against closing
while there is an outstanding async completion request, but it's an imperfect solution since
even after the python callback is executed there is still a window for deadlock before the
completion thread finishes the callback.
> a simple example to reproduce the deadlock is attached.
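
The bookkeeping approach mentioned in the description could look roughly like the sketch below.
This is not the reporter's code: the class name OutstandingTracker and its methods are
hypothetical, and nothing here calls zkpython. The idea is to count async requests whose
callbacks have not yet returned and to wait for that count to reach zero before closing.

```python
import threading


class OutstandingTracker:
    """Hypothetical helper: counts async requests whose callbacks have not returned."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def wrap(self, callback):
        """Register one outstanding request and return a callback that clears it."""
        with self._cond:
            self._pending += 1

        def wrapped(*args):
            try:
                return callback(*args)
            finally:
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()

        return wrapped

    def wait_idle(self, timeout=None):
        """Block until no callbacks are outstanding; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)
```

A client would pass tracker.wrap(cb) wherever it currently passes cb to an async call, and call
tracker.wait_idle() before closing the handle. As the description notes, this is imperfect:
wait_idle() can return as soon as the wrapped function finishes, which is still before the
completion thread has left the dispatch code that invoked it.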

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

