httpd-bugs mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bugzi...@apache.org
Subject [Bug 59237] http2 breaks external auth after a few successful requests
Date Thu, 15 Jun 2017 21:52:40 GMT
https://bz.apache.org/bugzilla/show_bug.cgi?id=59237

John Morrissey <jwm@horde.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jwm@horde.net
             Status|NEEDINFO                    |NEW

--- Comment #4 from John Morrissey <jwm@horde.net> ---
I recently experienced the same symptoms as Karim (Apache 2.4.25,
mod_authnz_external 3.3.2) when enabling HTTP2: basic auth fails randomly.

This doesn't look like a timeout on the pwauth subprocess. I can run it in a
continuous loop for as long as I want. Each execution consistently takes ~0.03s
of wallclock.

[jwm@boost:pts/9 ~> while :; do echo -e 'some-user\nsome-password' | time sudo
pwauth; if [ $? -ne 0 ]; then echo $?; fi; done
0.02user 0.00system 0:00.03elapsed 87%CPU (0avgtext+0avgdata 3712maxresident)k
0inputs+0outputs (0major+574minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 3576maxresident)k
0inputs+0outputs (0major+574minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 80%CPU (0avgtext+0avgdata 3720maxresident)k
0inputs+0outputs (0major+575minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3724maxresident)k
0inputs+0outputs (0major+577minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 93%CPU (0avgtext+0avgdata 3708maxresident)k
0inputs+0outputs (0major+576minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3728maxresident)k
0inputs+0outputs (0major+578minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3664maxresident)k
0inputs+0outputs (0major+571minor)pagefaults 0swaps
0.02user 0.00system 0:00.03elapsed 90%CPU (0avgtext+0avgdata 3668maxresident)k
0inputs+0outputs (0major+575minor)pagefaults 0swaps
[...]

OTOH, it looks like the Apache worker is killing the pwauth child before it's
had a chance to do anything at all:

[pid 13021] <... clone resumed> child_stack=NULL,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f17728969d0) = 14222
[pid 13020] <... wait4 resumed> 0x7ffff3893bac, WNOHANG|WSTOPPED, NULL) = 0
strace: Process 14222 attached
[pid 13020] kill(14222, SIGTERM <unfinished ...>

The backtrace (breakpointed on apr_proc_kill()) shows:

#0  apr_proc_kill (proc=0x7fffdc6faac0, signum=signum@entry=15)
    at ./threadproc/unix/signals.c:36
#1  0x00007ffff7725e0b in free_proc_chain (procs=0x7fffdbe2f068)
    at ./memory/unix/apr_pools.c:2464
#2  0x00007ffff7726b90 in apr_pool_destroy (pool=0x7fffdbe30028)
    at ./memory/unix/apr_pools.c:817
#3  0x00007ffff3e35f0c in h2_blist_cleanup (bl=bl@entry=0x7fffec21d2a8)
    at h2_bucket_beam.c:400
#4  0x00007ffff3e35ff7 in beam_send_cleanup (data=0x7fffec21d278) at
h2_bucket_beam.c:455
#5  0x00007ffff7726b36 in run_cleanups (cref=0x7fffdbe36098)
    at ./memory/unix/apr_pools.c:2352
#6  apr_pool_destroy (pool=0x7fffdbe36028) at ./memory/unix/apr_pools.c:804
#7  0x00007ffff3e41246 in task_destroy (m=0x7fffec2252d0, task=0x7fffdbe320a0,
    called_from_master=<optimized out>) at h2_mplx.c:394
#8  0x00007ffff3e41460 in stream_done (m=m@entry=0x7fffec2252d0,
    stream=stream@entry=0x7fffec21d0a0, rst_error=0) at h2_mplx.c:460
#9  0x00007ffff3e4279d in h2_mplx_stream_done (m=0x7fffec2252d0,
    stream=stream@entry=0x7fffec21d0a0) at h2_mplx.c:654
#10 0x00007ffff3e4a057 in h2_session_stream_done (session=0x7fffec2250a0,
    stream=0x7fffec21d0a0) at h2_session.c:85
#11 0x00005555555a5ce6 in remove_empty_buckets (bb=bb@entry=0x7ffff7e088c0)
    at core_filters.c:720
#12 0x00005555555a6112 in send_brigade_nonblocking (s=s@entry=0x7ffff7e080a0,
    bb=bb@entry=0x7ffff7e088c0,
bytes_written=bytes_written@entry=0x7fffec22f318,
    c=c@entry=0x7ffff7e08290) at core_filters.c:710
#13 0x00005555555a6e41 in send_brigade_blocking (c=0x7ffff7e08290,
    bytes_written=0x7fffec22f318, bb=0x7ffff7e088c0, s=0x7ffff7e080a0)
    at core_filters.c:733
#14 ap_core_output_filter (f=0x7fffec22f148, new_bb=0x7ffff7e088c0) at
core_filters.c:542
#15 0x00007fffed751c23 in ssl_io_filter_output (f=0x7ffff7e08878,
bb=0x7fffec22f3c8)
    at ssl_engine_io.c:1716
#16 0x00007fffed74edce in ssl_io_filter_coalesce (f=0x7ffff7e08850,
bb=0x7fffec22f3c8)
    at ssl_engine_io.c:1663
#17 0x00007ffff3e3a58c in pass_output (io=io@entry=0x7fffec2250e8,
flush=flush@entry=1,
    session_eoc=session_eoc@entry=0x0) at h2_conn_io.c:296
#18 0x00007ffff3e3aa5c in h2_conn_io_flush (io=io@entry=0x7fffec2250e8)
    at h2_conn_io.c:327
#19 0x00007ffff3e4a4aa in h2_session_ev_no_io (msg=0x0, arg=0,
session=0x7fffec2250a0)
    at h2_session.c:1840
#20 dispatch_event (session=session@entry=0x7fffec2250a0,
    ev=ev@entry=H2_SESSION_EV_NO_IO, arg=arg@entry=0, msg=msg@entry=0x0)
    at h2_session.c:1993
#21 0x00007ffff3e4e125 in h2_session_process (session=0x7fffec2250a0,
    async=async@entry=0) at h2_session.c:2239
#22 0x00007ffff3e39aca in h2_conn_run (ctx=ctx@entry=0x7fffec22f2d0,
    c=c@entry=0x7ffff7e08290) at h2_conn.c:212
#23 0x00007ffff3e3f9fb in h2_h2_process_conn (c=0x7ffff7e08290) at h2_h2.c:658
#24 0x00005555555b54f0 in ap_run_process_connection (c=c@entry=0x7ffff7e08290)
    at connection.c:42
#25 0x00005555555b5a40 in ap_process_connection (c=c@entry=0x7ffff7e08290,
    csd=<optimized out>) at connection.c:226
#26 0x00007ffff2f236bf in child_main (child_num_arg=child_num_arg@entry=0,
    child_bucket=child_bucket@entry=0) at prefork.c:723
#27 0x00007ffff2f238f2 in make_child (s=0x7ffff7fbf4a0, slot=slot@entry=0,
    bucket=bucket@entry=0) at prefork.c:767
#28 0x00007ffff2f24e37 in prefork_run (_pconf=<optimized out>,
plog=0x7ffff7fbc028,
    s=0x7ffff7fbf4a0) at prefork.c:979
#29 0x000055555558edde in ap_run_mpm (pconf=0x7ffff7fed028,
plog=0x7ffff7fbc028,
    s=0x7ffff7fbf4a0) at mpm_common.c:94
#30 0x0000555555587a1d in main (argc=<optimized out>, argv=<optimized out>) at
main.c:783

(gdb) print *proc
$2 = {pid = 8950, in = 0x7fffdbe24d60, out = 0x0, err = 0x0}

PID 8950 is indeed pwauth:

root          8950  8950  0.0  0.0      0     0 pts/1 Z    -            17:31
[pwauth] <defunct>

I don't know enough about the Apache 2.4 process model to say much more. It
looks like each prefork worker now has two threads. In this case, one of the
worker's threads clone()s to prepare the child process, but it's the worker's
*other* thread that kills the child process immediately.

How can I help debug this further?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


Mime
View raw message