Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id E2065200B33 for ; Wed, 15 Jun 2016 06:14:11 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id E0B4E160A56; Wed, 15 Jun 2016 04:14:11 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 33B6A160A06 for ; Wed, 15 Jun 2016 06:14:11 +0200 (CEST) Received: (qmail 14457 invoked by uid 500); 15 Jun 2016 04:14:10 -0000 Mailing-List: contact dev-help@zookeeper.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@zookeeper.apache.org Delivered-To: mailing list dev@zookeeper.apache.org Received: (qmail 13676 invoked by uid 99); 15 Jun 2016 04:14:09 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 15 Jun 2016 04:14:09 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 6C2152C1F61 for ; Wed, 15 Jun 2016 04:14:09 +0000 (UTC) Date: Wed, 15 Jun 2016 04:14:09 +0000 (UTC) From: "Patrick Hunt (JIRA)" To: dev@zookeeper.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Assigned] (ZOOKEEPER-2443) Invalid Memory Access (SEGFAULT) and undefined behaviour in c client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Wed, 15 Jun 2016 04:14:12 -0000 [ https://issues.apache.org/jira/browse/ZOOKEEPER-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-2443: --------------------------------------- Assignee: Chris Nauroth [~cnauroth], [~fpj], [~rgs] can you take a look at this? According to Paul the issue was introduced in ZOOKEEPER-1029. > Invalid Memory Access (SEGFAULT) and undefined behaviour in c client > -------------------------------------------------------------------- > > Key: ZOOKEEPER-2443 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2443 > Project: ZooKeeper > Issue Type: Bug > Components: c client > Reporter: Paul Asmuth > Assignee: Chris Nauroth > > Hey, > I encountered some issues with the zookeepeer c client. > The problem starts in the zookeeper_init_internal method. A lot of initialization work is performed here and if any of the initialization routines fails, the code jumps to the "abort" label to perform various cleanup tasks [1]. The conceptual issue is that a bunch of the cleanup code tries to take locks on the zk structure that are only intialized in adaptor_init in line 1181 (at the very end of the zookeeper_init_internal method) [2]. So if we fail before reaching adaptor_init this causes trouble. > One specific instance of an invalid memory access that this causes is in free_completions [3]. Here, in line 1651 zoo_lock_auth will fail because it tries to grab an invalid mutex, after which the a_list struct is uninitialized (the linked list next pointer points to random memory) and subsequently the free routine segfaults. > An easy way to trigger this bug-path is to pass an invalid hostname, or do anything else that causes the zookeeper_init_internal method to fail before adaptor_init. > In my local checkout/codebase, I have added correct initialization for the a_list struct in the free_completions routine, which at least fixes the segfault for now. However this still leaves the issue that the cleanup code tries to grab a lot of invalid locks, which all fail. I think in order to fix this properly, one would need to do a larger refactoring of the code (add another adaptor_preinit routine to the adaptor interface maybe?) and I wasn't sure if that would be appreciated, so I didn't attach a patch for now. If someone wants me to try and clean this up, I would be happy to give it a try. > PS: I think this bug was introduced in SVN #1719528, which - as it seems - tried to work around the uninitialized locks problem by adding an int return code to all the lock_xxx functions, allowing them to indicate a failure. The change introduce the invalid memory access since some (always required) init code is only run after the lock was obtained successfully. > However, I think there is a much large issue with the change and I think it must be reverted. Trying to lock an uninitialized mutex is undefined behaviour on POSIX and may lead to deadlocks, etc. > >> If mutex does not refer to an initialized mutex object, the behavior of pthread_mutex_lock(), pthread_mutex_trylock(), and pthread_mutex_unlock() is undefined. > http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html > [1] https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1078 > [2] https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1181 > [3] https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1651 > ------------------ > BACKTRACE > Program received signal SIGSEGV, Segmentation fault. > 0x000000010004f6d5 in free_auth_completion (a_list=0x7fff5fbff048) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:260 > 260 tmp = tmp->next; > #0 0x000000010004f6d5 in free_auth_completion (a_list=0x7fff5fbff048) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:260 > #1 0x000000010004f500 in free_completions (zh=0x1003022f0, callCompletion=1, reason=-116) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:1219 > #2 0x0000000100057bfd in cleanup_bufs (zh=0x1003022f0, callCompletion=1, rc=-116) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:1227 > #3 0x000000010004ee42 in destroy (zh=0x1003022f0) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:393 > #4 0x000000010004eaf3 in zookeeper_init (host=0x1006005b0 "xxxinvalidhostname:2181", watcher=0x100007670 , > recv_timeout=10000, clientid=0x0, context=0x100600350, flags=0) at /deps/3rdparty/zookeeper/source/src/zookeeper.c:877 -- This message was sent by Atlassian JIRA (v6.3.4#6332)