hadoop-zookeeper-dev mailing list archives

From "Mike Solomon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-740) zkpython leading to segfault on zookeeper
Date Sat, 05 Jun 2010 02:53:53 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875842#action_12875842 ]

Mike Solomon commented on ZOOKEEPER-740:
----------------------------------------

The common case is when you supply a watcher for a get().

Start a zk server on localhost and create a node /zk.

Connect a python process like this:

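import sys
import zookeeper

# Open a session to the local server.
zh = zookeeper.init('localhost:2181')

# Watcher: log whatever arguments zkpython passes in (Python 2 print syntax).
def _zk_callback(*args):
  print >> sys.stderr, "_zk_callback",  args

# Read /zk and register _zk_callback as the watcher for that node.
zookeeper.get(zh, '/zk', _zk_callback)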

Kill the zk server. The client idles fine. When the zk server is restarted, I get a SIGSEGV
on reconnect 100% of the time.

This is fixed by the following patch:

[msolomon]yuriko:~/src/zookeeper-3.3.1/src/contrib/zkpython> svn di
Index: src/c/zookeeper.c
===================================================================
--- src/c/zookeeper.c	(revision 951628)
+++ src/c/zookeeper.c	(working copy)
@@ -436,7 +436,9 @@
   if (PyObject_CallObject((PyObject*)callback, arglist) == NULL) {
     PyErr_Print();
   }
-  if (pyw->permanent == 0) {
+  // msolomon: when a session event happens, watchers get dispatched,
+  // but they are retained in the C client for dispatch again.
+  if (pyw->permanent == 0 && type != ZOO_SESSION_EVENT) {
     free_pywatcher(pyw);
   }
   PyGILState_Release(gstate);
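
To illustrate why the extra check matters, here is a minimal pure-Python model of the watcher lifecycle. It is only a sketch, not zkpython code: WatcherContext, dispatch, simulate and the UseAfterFree exception are hypothetical names, and the "freed" flag stands in for what free_pywatcher() does in C. The event values mirror the C client's constants (type=-1 in the backtrace is the session event). The model assumes, as the patch comment says, that the C client keeps the watcher registered across session events.

```python
SESSION_EVENT = -1  # mirrors ZOO_SESSION_EVENT in the C client
CHANGED_EVENT = 3   # mirrors ZOO_CHANGED_EVENT in the C client


class UseAfterFree(Exception):
    """Stands in for the SIGSEGV a real dangling pointer would cause."""


class WatcherContext(object):
    """Hypothetical stand-in for the per-watcher context held by the binding."""

    def __init__(self, callback, permanent=False):
        self.callback = callback
        self.permanent = permanent
        self.freed = False


def dispatch(ctx, event_type, patched):
    """Model of watcher dispatch, with or without the session-event check."""
    if ctx.freed:
        raise UseAfterFree("watcher context already freed")
    ctx.callback(event_type)
    is_session = event_type == SESSION_EVENT
    # Unpatched: free after any event. Patched: keep the context on session events.
    if not ctx.permanent and not (patched and is_session):
        ctx.freed = True


def simulate(patched):
    """Deliver disconnect, reconnect, then a node change to one retained watcher."""
    seen = []
    ctx = WatcherContext(seen.append)
    try:
        for event_type in (SESSION_EVENT, SESSION_EVENT, CHANGED_EVENT):
            dispatch(ctx, event_type, patched)
    except UseAfterFree:
        return seen, "use-after-free"
    return seen, "ok"


if __name__ == "__main__":
    print(simulate(patched=False))
    print(simulate(patched=True))
```

In this model the unpatched path frees the context on the first session event (the disconnect) and then trips over it on the second (the reconnect), which matches the crash on reconnect described above. The patched path delivers both session events and frees the context only after the node event.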




> zkpython leading to segfault on zookeeper
> -----------------------------------------
>
>                 Key: ZOOKEEPER-740
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-740
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Federico
>            Assignee: Henry Robinson
>            Priority: Critical
>             Fix For: 3.4.0
>
>
> The program that we are implementing uses the Python binding for ZooKeeper, but sometimes it crashes with a segfault; here is the backtrace from gdb:
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xad244b70 (LWP 28216)]
> 0x080611d5 in PyObject_Call (func=0x862fab0, arg=0x8837194, kw=0x0)
>     at ../Objects/abstract.c:2488
> 2488    ../Objects/abstract.c: No such file or directory.
>         in ../Objects/abstract.c
> (gdb) bt
> #0  0x080611d5 in PyObject_Call (func=0x862fab0, arg=0x8837194, kw=0x0)
>     at ../Objects/abstract.c:2488
> #1  0x080d6ef2 in PyEval_CallObjectWithKeywords (func=0x862fab0,
>     arg=0x8837194, kw=0x0) at ../Python/ceval.c:3575
> #2  0x080612a0 in PyObject_CallObject (o=0x862fab0, a=0x8837194)
>     at ../Objects/abstract.c:2480
> #3  0x0047af42 in watcher_dispatch (zzh=0x86174e0, type=-1, state=1,
>     path=0x86337c8 "", context=0x8588660) at src/c/zookeeper.c:314
> #4  0x00496559 in do_foreach_watcher (zh=0x86174e0, type=-1, state=1,
>     path=0x86337c8 "", list=0xa5354140) at src/zk_hashtable.c:275
> #5  deliverWatchers (zh=0x86174e0, type=-1, state=1, path=0x86337c8 "",
>     list=0xa5354140) at src/zk_hashtable.c:317
> #6  0x0048ae3c in process_completions (zh=0x86174e0) at src/zookeeper.c:1766
> #7  0x0049706b in do_completion (v=0x86174e0) at src/mt_adaptor.c:333
> #8  0x0013380e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> #9  0x002578de in clone () from /lib/tls/i686/cmov/libc.so.6

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

