Subject: Re: mod_http2 and Frequent wake-ups for mpm_event
From: Stefan Eissing
Date: Sun, 22 Jan 2017 17:17:02 +0100
Cc: Yann Ylavic
Message-Id: <8823CE4D-1357-4939-8FD7-108167BA75E9@greenbytes.de>
To: dev@httpd.apache.org

> On 22.01.2017 at 17:14, Stefan Priebe - Profihost AG wrote:
>
> *arg* it's just mod_proxy - just saw thread safety and apr bucket alloc.

??? Can you elaborate? Is your finding the known hcheck bug or something else?

> Stefan
>
> On 22.01.2017 at 17:06, Stefan Priebe - Profihost AG wrote:
>> Looks like others have the same crashes too:
>> https://bz.apache.org/bugzilla/show_bug.cgi?id=60071
>> and
>> https://github.com/apache/httpd/commit/8e63c3c9372cd398f57357099aa941cbba695758
>>
>> So it looks like mod_http2 is running fine now. Thanks a lot, Stefan.
>>
>> Yann, I think I can start testing your mpm patch again after the
>> segfaults in the 2.4 branch are fixed.
>>
>> Greets,
>> Stefan
>>
>> On 22.01.2017 at 13:16, Stefan Priebe wrote:
>>> Hi,
>>>
>>> and a new one, but also in ap_start_lingering_close:
>>>
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0  apr_palloc (pool=pool@entry=0x7f455805e138, in_size=in_size@entry=32)
>>>     at memory/unix/apr_pools.c:684
>>> #1  0x00007f456bc5d8b4 in apr_brigade_create (p=0x7f455805e138,
>>>     list=0x7f45040034e8) at buckets/apr_brigade.c:61
>>> #2  0x000055e165efa319 in ap_shutdown_conn (c=c@entry=0x7f455805e458,
>>>     flush=flush@entry=1) at connection.c:76
>>> #3  0x000055e165efa40d in ap_flush_conn (c=0x7f455805e458) at connection.c:95
>>> #4  ap_start_lingering_close (c=0x7f455805e458) at connection.c:145
>>> #5  0x000055e165f942dd in start_lingering_close_blocking (cs=<optimized out>)
>>>     at event.c:876
>>> #6  process_socket (my_thread_num=<optimized out>, my_child_num=<optimized out>,
>>>     cs=0x7f455805e3c8, sock=<optimized out>, p=<optimized out>,
>>>     thd=<optimized out>) at event.c:1153
>>> #7  worker_thread (thd=0x7f455805e138, dummy=0x20) at event.c:2001
>>> #8  0x00007f456b80a0a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #9  0x00007f456b53f62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>
>>> Stefan
>>>
>>> On 21.01.2017 at 19:31, Stefan Priebe wrote:
>>>> All the last traces come from event, process_lingering_close / ap_push_pool,
>>>> but end in different functions. It looks like a race somewhere, and it just
>>>> races at a different function in the event of close and pool clear.
>>>>
>>>> Might there be two places where the same pool gets cleared?
>>>>
>>>> Stefan
>>>>
>>>> On 21.01.2017 at 19:07, Stefan Priebe wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> thanks. No crashes where h2 comes up. But I still have these and no idea
>>>>> how to find out who and why they're crashing.
>>>>>
>>>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>>> Core was generated by `/usr/local/apache2/bin/httpd -k start'.
>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>> #0  allocator_free (node=0x0, allocator=0x7f6e08066540)
>>>>>     at memory/unix/apr_pools.c:381
>>>>> #1  apr_pool_clear (pool=0x7f6e0808d238) at memory/unix/apr_pools.c:793
>>>>> #2  0x00000000004fe528 in ap_push_pool (queue_info=0x0,
>>>>>     pool_to_recycle=0x7f6e08066548) at fdqueue.c:234
>>>>> #3  0x00000000004fa2c8 in process_lingering_close (cs=0x7f6e0808d4c8,
>>>>>     pfd=0x1d3bf98) at event.c:1439
>>>>> #4  0x00000000004fd410 in listener_thread (thd=0x1d3cb70,
>>>>>     dummy=0x7f6e0808d4c8) at event.c:1704
>>>>> #5  0x00007f6e1aed20a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> #6  0x00007f6e1aa0362d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>> (gdb) quit
>>>>>
>>>>> Reading symbols from /usr/local/apache/bin/httpd...Reading symbols from
>>>>> /usr/lib/debug//usr/local/apache2/bin/httpd...done.
>>>>> done.
>>>>> [Thread debugging using libthread_db enabled]
>>>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>>> Core was generated by `/usr/local/apache2/bin/httpd -k start'.
>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>> #0  allocator_free (node=0x0, allocator=0x7f6e08053ae0)
>>>>>     at memory/unix/apr_pools.c:381
>>>>> #1  apr_pool_clear (pool=0x7f6e08076bb8) at memory/unix/apr_pools.c:793
>>>>> #2  0x00000000004fe528 in ap_push_pool (queue_info=0x0,
>>>>>     pool_to_recycle=0x7f6e08053ae8) at fdqueue.c:234
>>>>> #3  0x00000000004fa2c8 in process_lingering_close (cs=0x7f6e08076e48,
>>>>>     pfd=0x1d3bf98) at event.c:1439
>>>>> #4  0x00000000004fd410 in listener_thread (thd=0x1d3cb70,
>>>>>     dummy=0x7f6e08076e48) at event.c:1704
>>>>> #5  0x00007f6e1aed20a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> #6  0x00007f6e1aa0362d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>> (gdb) quit
>>>>>
>>>>> Stefan
>>>>>
>>>>> On 21.01.2017 at 17:03, Stefan Eissing wrote:
>>>>>> Stefan,
>>>>>>
>>>>>> made a release at https://github.com/icing/mod_h2/releases/tag/v1.8.9
>>>>>> with all patches and (hopefully) improved on them a bit. If you dare
>>>>>> to drop that into your installation, that'd be great.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>>> On 21.01.2017 at 15:25, Stefan Priebe wrote:
>>>>>>>
>>>>>>> and I got another crash here:
>>>>>>>
>>>>>>> 2346 static void run_cleanups(cleanup_t **cref)
>>>>>>> 2347 {
>>>>>>> 2348     cleanup_t *c = *cref;
>>>>>>> 2349
>>>>>>> 2350     while (c) {
>>>>>>> 2351         *cref = c->next;
>>>>>>> 2352         (*c->plain_cleanup_fn)((void *)c->data);   <== here
>>>>>>> 2353         c = *cref;
>>>>>>> 2354
>>>>>>>
>>>>>>> which looks similar to the other crash.
>>>>>>>
>>>>>>> #0  0x00007fe4bbd33e1b in run_cleanups (cref=<optimized out>)
>>>>>>>     at memory/unix/apr_pools.c:2352
>>>>>>> #1  apr_pool_clear (pool=0x7fe4a804dac8) at memory/unix/apr_pools.c:772
>>>>>>> #2  0x00000000004feb38 in ap_push_pool (queue_info=0x6d616e79642d3733,
>>>>>>>     pool_to_recycle=0x2) at fdqueue.c:234
>>>>>>> #3  0x00000000004fa8d8 in process_lingering_close (cs=0x7fe4a804dd58,
>>>>>>>     pfd=0x25d3f98) at event.c:1439
>>>>>>>
>>>>>>> Details:
>>>>>>> (gdb) print c
>>>>>>> $1 = (cleanup_t *) 0x7fe4a804e9f0
>>>>>>> (gdb) print *c
>>>>>>> $2 = {next = 0x7fe4a804e870, data = 0x6d616e79642d3733,
>>>>>>>   plain_cleanup_fn = 0x392d3734322e6369,
>>>>>>>   child_cleanup_fn = 0x617465722e722d35}
>>>>>>> (gdb) print *c->data
>>>>>>> Attempt to dereference a generic pointer.
>>>>>>> (gdb) print *c->plain_cleanup_fn
>>>>>>> Cannot access memory at address 0x392d3734322e6369
>>>>>>> (gdb)
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 21.01.2017 at 15:18, Stefan Priebe wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> #0  apr_pool_cleanup_kill (p=0x7fe4a8072358,
>>>>>>>>     data=data@entry=0x7fe4a80723e0,
>>>>>>>>     cleanup_fn=cleanup_fn@entry=0x7fe4bbd38a40)
>>>>>>>>     at memory/unix/apr_pools.c:2276
>>>>>>>>
>>>>>>>> it crashes here in apr:
>>>>>>>> 2276     if (c->data == data && c->plain_cleanup_fn == cleanup_fn) {
>>>>>>>>
>>>>>>>> some lines before, c becomes this:
>>>>>>>> 2264     c = p->cleanups;
>>>>>>>>
>>>>>>>> p is:
>>>>>>>> (gdb) print *p
>>>>>>>> $1 = {parent = 0x256f138, child = 0x7fe46c0751c8,
>>>>>>>>   sibling = 0x7fe4a8096888, ref = 0x7fe4a8069fe8,
>>>>>>>>   cleanups = 0x7fe478159748, free_cleanups = 0x7fe478159788,
>>>>>>>>   allocator = 0x7fe4a803b490, subprocesses = 0x0,
>>>>>>>>   abort_fn = 0x43da00, user_data = 0x0,
>>>>>>>>   tag = 0x502285 "transaction", active = 0x7fe478158d70,
>>>>>>>>   self = 0x7fe4a8072330,
>>>>>>>>   self_first_avail = 0x7fe4a80723d0 "X#\a\250\344\177",
>>>>>>>>   pre_cleanups = 0x7fe4a8072ab8}
>>>>>>>>
>>>>>>>> wouldn't the error mean that p->cleanups is NULL?
>>>>>>>>
>>>>>>>> (gdb) print *p->cleanups
>>>>>>>> $2 = {next = 0x7fe478159628, data = 0x7fe478159648,
>>>>>>>>   plain_cleanup_fn = 0x7fe4bbd2ffd0,
>>>>>>>>   child_cleanup_fn = 0x7fe4bbd2ff70}
>>>>>>>>
>>>>>>>> So p->cleanups->data is 0x7fe478159648 and data is 0x7fe4a80723e0?
>>>>>>>>
>>>>>>>> I don't get why it's segfaulting.
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> On 21.01.2017 at 09:50, Yann Ylavic wrote:
>>>>>>>>> Hi Stefan,
>>>>>>>>>
>>>>>>>>> On Sat, Jan 21, 2017 at 9:45 AM, Stefan Priebe wrote:
>>>>>>>>>>
>>>>>>>>>> after running the whole night, these are the only ones still
>>>>>>>>>> happening. Should I revert the mpm patch to check whether it's
>>>>>>>>>> the source?
>>>>>>>>>
>>>>>>>>> Yes please, we need to determine...
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Yann.
>>>>>>
>>>>>> Stefan Eissing
>>>>>>
>>>>>> bytes GmbH
>>>>>> Hafenstrasse 16
>>>>>> 48155 Münster
>>>>>> www.greenbytes.de

Stefan Eissing

bytes GmbH
Hafenstrasse 16
48155 Münster
www.greenbytes.de