zookeeper-user mailing list archives

From Mahadev Konar <devko...@gmail.com>
Subject Re: hang in zookeeper_close() in the mt C client
Date Mon, 07 Feb 2011 05:38:22 GMT
great. thanks michi!



On Tue, Feb 1, 2011 at 1:33 PM, Michi Mutsuzaki <michim@yahoo-inc.com> wrote:
> I'll take a look and see if I can reproduce it.
>
> --Michi
>
> On 2/1/11 1:24 PM, "Patrick Hunt" <phunt@apache.org> wrote:
>
>> Great. I marked it as critical given the client hangs. If someone has
>> a chance to look at what this might be (mahadev or michi?) it would be
>> great to get a fix in.
>>
>> Patrick
>>
>> On Tue, Feb 1, 2011 at 12:23 PM, Jeremy Stribling <strib@nicira.com> wrote:
>>> Ok, done:
>>>
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-981
>>>
>>> On 02/01/2011 10:00 AM, Patrick Hunt wrote:
>>>>
>>>> Hi Jeremy. Nothing comes to mind. I searched around on jira a bit and
>>>> nothing there pops out at me either.
>>>>
>>>> I'd encourage you to create a jira regardless, add the details you
>>>> have available currently and if you are able to reproduce attach
>>>> additional information.
>>>>
>>>> Regards,
>>>>
>>>> Patrick
>>>>
>>>> On Mon, Jan 31, 2011 at 6:09 PM, Jeremy Stribling<strib@nicira.com>
>>>>  wrote:
>>>>
>>>>>
>>>>> I haven't been able to reproduce it, but if I do I will update again with
>>>>> more details and make a JIRA.  I was hoping someone might just know
>>>>> something off the top of their head.  Thanks!
>>>>>
>>>>> On 01/31/2011 05:05 PM, Patrick Hunt wrote:
>>>>>
>>>>>>
>>>>>> Hi Jeremy, that is unusual, is it reproducible? Do you have details
>>>>>> on the stack for threads other than this thread doing the close?
>>>>>>
>>>>>> It would be best if you could create a JIRA for this, the more detail
>>>>>> you could provide the better (full stacks and any log files)
>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>> On Mon, Jan 31, 2011 at 10:26 AM, Jeremy Stribling<strib@nicira.com>
>>>>>>  wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I responded to someone off-list about this, but I just wanted to
>>>>>>> clarify to everyone that the part of the backtrace that isn't shown is
>>>>>>> entirely within my application, and zookeeper_close isn't being called
>>>>>>> from any Zookeeper completion thread.
>>>>>>>
>>>>>>> On 01/29/2011 11:23 AM, Jeremy Stribling wrote:
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> I use the multithreaded ZK C client library (3.3.2), and I'm seeing my
>>>>>>>> application hang, and the only thread in it that's doing anything
>>>>>>>> interesting is this one:
>>>>>>>>
>>>>>>>> Thread 8 (Thread 5644):
>>>>>>>> #0  0x00007f5d7bb5bbe4 in __lll_lock_wait () from /lib/libpthread.so.0
>>>>>>>> #1  0x00007f5d7bb59ad0 in pthread_cond_broadcast@@GLIBC_2.3.2 () from
>>>>>>>> /lib/libpthread.so.0
>>>>>>>> #2  0x00007f5d793628f6 in unlock_completion_list (l=0x32b4d68) at
>>>>>>>> .../zookeeper/src/c/src/mt_adaptor.c:66
>>>>>>>> #3  0x00007f5d79354d4b in free_completions (zh=0x32b4c80,
>>>>>>>> callCompletion=1, reason=-116) at
>>>>>>>> .../zookeeper/src/c/src/zookeeper.c:1069
>>>>>>>> #4  0x00007f5d79355008 in cleanup_bufs (zh=0x32b4c80,
>>>>>>>> callCompletion=1, rc=-116) at
>>>>>>>> .../thirdparty/zookeeper/src/c/src/zookeeper.c:1125
>>>>>>>> #5  0x00007f5d79353200 in destroy (zh=0x32b4c80) at
>>>>>>>> .../thirdparty/zookeeper/src/c/src/zookeeper.c:366
>>>>>>>> #6  0x00007f5d79358e0e in zookeeper_close (zh=0x32b4c80) at
>>>>>>>> .../zookeeper/src/c/src/zookeeper.c:2326
>>>>>>>> #7  0x00007f5d79356d18 in api_epilog (zh=0x32b4c80, rc=0) at
>>>>>>>> .../zookeeper/src/c/src/zookeeper.c:1661
>>>>>>>> #8  0x00007f5d79362f2f in adaptor_finish (zh=0x32b4c80) at
>>>>>>>> .../zookeeper/src/c/src/mt_adaptor.c:205
>>>>>>>> #9  0x00007f5d79358c8c in zookeeper_close (zh=0x32b4c80) at
>>>>>>>> .../zookeeper/src/c/src/zookeeper.c:2297
>>>>>>>> ....
>>>>>>>>
>>>>>>>> I've seen some threads online about how there's a race condition
>>>>>>>> associated with zookeeper_close, where if your app is making a
>>>>>>>> synchronous call at the same time using the closed zk_handle, there
>>>>>>>> could be a hang.  However, my app makes no synchronous calls, and I'm
>>>>>>>> 99% sure that no other thread in my app is making any concurrent call
>>>>>>>> into the library ('thread apply all bt' in gdb doesn't show any other
>>>>>>>> usage of the library, anyway).
>>>>>>>>
>>>>>>>> Has anyone seen this before?  Any leads?  Thanks,
>>>>>>>>
>>>>>>>> Jeremy
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>
>
>
