Return-Path: X-Original-To: apmail-httpd-dev-archive@www.apache.org Delivered-To: apmail-httpd-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 25BCB105FA for ; Fri, 25 Jul 2014 17:31:48 +0000 (UTC) Received: (qmail 59945 invoked by uid 500); 25 Jul 2014 17:31:47 -0000 Delivered-To: apmail-httpd-dev-archive@httpd.apache.org Received: (qmail 59878 invoked by uid 500); 25 Jul 2014 17:31:47 -0000 Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: List-Post: List-Id: Delivered-To: mailing list dev@httpd.apache.org Received: (qmail 59868 invoked by uid 99); 25 Jul 2014 17:31:47 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 25 Jul 2014 17:31:47 +0000 X-ASF-Spam-Status: No, hits=2.4 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of chaosed0@gmail.com designates 209.85.218.51 as permitted sender) Received: from [209.85.218.51] (HELO mail-oi0-f51.google.com) (209.85.218.51) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 25 Jul 2014 17:31:43 +0000 Received: by mail-oi0-f51.google.com with SMTP id g201so3597782oib.38 for ; Fri, 25 Jul 2014 10:31:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=y96ZojGqIm3sp+HrRqYfcA0/j6E1LakWYuSZsA5yauw=; b=uHiXhRiYE1J655hZ0WMmnLSvK0U+ss6d47wU9AlAp2szbu8RD7MbMwOD4oaVqHoQ4g /ymxzLzC2nicKy5N4QgG/nw87MCE8uNs2dtB1EFPMtzOpq6JkMU5rYcnK5/ZFJ9NgFft toey/VlE9I/vsiTKQLYvzwO6GuIsLDlZYSGGT8HwVFWjARDDDHr6pG/IV/Tt/PbpJijn cTAzEdW3YTo6GejRvubJGpjdDRckVQvLg49WqCUFCrwAqzWjgjMb5IQMJfzxFOfut9QC HCVQ2aYZtNXV9JjuSyg/6dreY7l08mKl4CvavQ6ZYsACvHRD5+oiAS7zx7gCicKuuuBK rSew== MIME-Version: 1.0 X-Received: by 10.60.134.136 with SMTP id pk8mr24477079oeb.81.1406309482336; Fri, 25 Jul 2014 10:31:22 -0700 (PDT) Received: by 10.182.79.161 with HTTP; Fri, 25 Jul 2014 10:31:22 -0700 (PDT) In-Reply-To: <53D27CE0.3050004@apache.org> References: <56C2E06F97EACF4FB2F1B31EEFC4493091FA8B62@mbx2.wgti.net> <56C2E06F97EACF4FB2F1B31EEFC4493091FA9160@mbx2.wgti.net> <56C2E06F97EACF4FB2F1B31EEFC4493091FA9A51@mbx2.wgti.net> <56C2E06F97EACF4FB2F1B31EEFC4493091FA9C6B@mbx2.wgti.net> <56C2E06F97EACF4FB2F1B31EEFC4493091FAC70B@mbx2.wgti.net> <53D27CE0.3050004@apache.org> Date: Fri, 25 Jul 2014 13:31:22 -0400 Message-ID: Subject: Re: Question about async mod_proxy_wstunnel and threads From: Edward Lu To: dev@httpd.apache.org Content-Type: multipart/alternative; boundary=047d7b471c2e23987904ff07f1a0 X-Virus-Checked: Checked by ClamAV on apache.org --047d7b471c2e23987904ff07f1a0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Re: Patch; yes, I think that works as well, but I'm still not sure it makes sense. If two or more sockets are readable, the first one that's processed will remove itself and the other ones from the pollset; the rest will try to remove themselves from the pollset, but nothing will happen (since they've already been removed by the processing of the first one). On Fri, Jul 25, 2014 at 11:50 AM, Ruediger Pluem wrote: > Does the following patch work as well? > > Index: event.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- event.c (revision 1613354) > +++ event.c (working copy) > @@ -1973,9 +1973,12 @@ > apr_pollset_remove(event_pollset, baton->pfds[i]= ); > } > > + apr_pollset_remove(event_pollset, out_pfd); > push_timer2worker(te); > } > - apr_pollset_remove(event_pollset, out_pfd); > + else { > + apr_pollset_remove(event_pollset, out_pfd); > + } > } > out_pfd++; > num--; > > Regards > > R=C3=BCdiger > > Edward Lu wrote: > > I think I've found the issue; there's an extraneous apr_pollset_remove(= ) > call in event.c:1978. Relevant code > > (double-slash comments are mine): > > > > /* We only signal once per N sockets with this baton */ > > if (!(baton->signaled)) { > > baton->signaled =3D 1; > > te =3D event_get_timer_event(-1 /* fake timer */, > > baton->cbfunc, > > baton->user_baton, > > 0, /* don't insert it */ > > NULL /* no associated socket callbac= k > */); > > /* remove other sockets in my set */ > > // says remove "other" sockets, but it really removes all > > for (i =3D 0; i < baton->nsock; i++) { > > rc =3D apr_pollset_remove(event_pollset, baton->pfds[i]); > > } > > > > push_timer2worker(te); > > } > > // This causes the problem; if the sockets are processed and re-add= ed > > // before this is called, the backend socket gets re-removed, neve= r > to > > // be seen again (until the client sends something) > > apr_pollset_remove(event_pollset, out_pfd); > > > > Seems like the solution is just to remove the extra > apr_pollset_remove(); all sockets are removed from the pollset when > > one of them is signalled. If another one was signalled in the same poll= , > it will enter this block and simply do nothing > > (because of the baton->signaled check). I gave it a quick test and it > looked like the problem was solved. > > > > > > On Tue, Jul 22, 2014 at 10:46 AM, Edward Lu chaosed0@gmail.com>> wrote: > > > > I got it to break with these python scripts, as well as with my own > test client and server (browser javascript and > > node.js). Strange though, I can't reproduce it with only 2 clients > on either of those programs; I have to get 3 > > clients connected at once. Past that, I see exactly the behavior yo= u > describe. > > > > I agree that it seems to be some sort of timing issue - I've been > trying to put trace statements into the HTTPD to > > see what information I can get, but once I do that it starts > behaving as expected. Still investigating... > > > > > > On Tue, Jul 22, 2014 at 10:40 AM, Steve Zweep < > Steve.Zweep@watchguard.com > wrote: > > > > Another tidbit of information ... > > > > I attempted to see what was going on by running httpd in the > foreground with strace. When I did that, the tests > > worked perfectly. I guess this suggests it is some sort of > timing issue. Running httpd in the foreground without > > trace exhibited the problem. > > > > Here's the output of strace during the test. I sent the string > "hhhhhhh" from client 1 and it was echoed back by > > the server to both clients, as expected. > > > > [pid 15451] <... futex resumed> ) =3D 0 > > [pid 15451] futex(0x891ec54, FUTEX_WAKE_PRIVATE, 1) =3D 0 > > [pid 15451] epoll_wait(13, {{EPOLLIN, {u32=3D144017752, > u64=3D144017752}}}, 2, 0) =3D 1 > > [pid 15451] read(12, "\201\10hhhhhhhh", 8000) =3D 10 > > [pid 15451] mprotect(0xb6a3a000, 8192, PROT_READ|PROT_WRITE) = =3D 0 > > [pid 15451] read(12, 0xb6a39aa8, 8000) =3D -1 EAGAIN (Resource > temporarily unavailable) > > [pid 15451] writev(11, [{"\201\10hhhhhhhh", 10}], 1) =3D 10 > > [pid 15451] read(12, 0xa96084b0, 8000) =3D -1 EAGAIN (Resource > temporarily unavailable) > > [pid 15451] epoll_wait(13, {}, 2, 0) =3D 0 > > [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 11, {EPOLLIN, > {u32=3D144044472, u64=3D144044472}}) =3D 0 > > [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 12, {EPOLLIN, > {u32=3D144044496, u64=3D144044496}}) =3D 0 > > [pid 15451] gettimeofday({1406038173, 745644}, NULL) =3D 0 > > [pid 15451] epoll_wait(10, {{EPOLLIN, {u32=3D2843761856, > u64=3D2843761856}}}, 2, 0) =3D 1 > > [pid 15451] read(9, "\201\10hhhhhhhh", 8000) =3D 10 > > [pid 15451] read(9, 0xb6a39aa8, 8000) =3D -1 EAGAIN (Resource > temporarily unavailable) > > [pid 15451] writev(8, [{"\201\10hhhhhhhh", 10}], 1) =3D 10 > > [pid 15451] read(9, 0xa96084b0, 8000) =3D -1 EAGAIN (Resource > temporarily unavailable) > > [pid 15451] epoll_wait(10, {}, 2, 0) =3D 0 > > [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 8, {EPOLLIN, > {u32=3D2843796920, u64=3D2843796920}}) =3D 0 > > [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 9, {EPOLLIN, > {u32=3D2843796944, u64=3D2843796944}}) =3D 0 > > [pid 15451] gettimeofday({1406038173, 745832}, NULL) =3D 0 > > [pid 15451] futex(0x891ec88, FUTEX_WAIT_PRIVATE, 38, NULL > > > > > > > - Steve > > > > -----Original Message----- > > From: Steve Zweep > > Sent: Monday, July 21, 2014 11:56 AM > > To: 'dev@httpd.apache.org ' > > Subject: RE: Question about async mod_proxy_wstunnel and thread= s > > > > My client and server are based on sample code included with the > ws4py package > > (http://ws4py.readthedocs.org/en/latest/). We use CherryPy and > the ws4py package contains a CherryPy-based > > server (attached). > > > > The client is a variation of the echo_client.py sample code als= o > included in the ws4py package. > > > > In terms of failure rate, I'd say it is about 90%. Often I star= t > 2 clients and initially it works OK, but then > > stops after a message or two. Once it has failed, I can send an= y > number of messages from the one client and > > continue to see the server echo them back, but no message is > received by the second. When I then send a message > > from the second client I see a flood of messages from the serve= r > that had been queued. > > > > - Steve > > > > > > -----Original Message----- > > From: Eric Covener [mailto:covener@gmail.com covener@gmail.com>] > > Sent: Saturday, July 19, 2014 8:16 AM > > To: Apache HTTP Server Development List > > Subject: Re: Question about async mod_proxy_wstunnel and thread= s > > > > I guess it is kind of telling that we could still respond to th= e > write from client 2, there are probably more > > potential errors that make us lose track of both sides > simultaneously. > > > > Can you share your ws client and server if they're easy to use? > My little sample doesn't spray that way. > > > > I'm also curious approximately how often it fails this way. > > > > On Fri, Jul 18, 2014 at 5:50 PM, Yann Ylavic < > ylavic.dev@gmail.com > wrote: > > > Probably a false trail, trunk at r1611741 is up to date, I > think 1.5.x > > > won't help. > > > > > > On Fri, Jul 18, 2014 at 11:35 PM, Steve Zweep > > > > > wrote: > > >> Hi Yann, > > >> > > >> The test I ran today was built with APR from trunk > (r1611741). I see that r1605769 modifies apr_skiplist.c > > and there has been a lot of activity in the trunk version of > that code lately. I can try with the 1.5.x branch > > code to see if it makes a difference. Probably won't get to thi= s > before Monday though. > > >> > > >> - Steve > > >> > > >> -----Original Message----- > > >> From: Yann Ylavic [mailto:ylavic.dev@gmail.com ylavic.dev@gmail.com>] > > >> Sent: Friday, July 18, 2014 4:51 PM > > >> To: httpd > > >> Subject: Re: Question about async mod_proxy_wstunnel and > threads > > >> > > >> Hi Steve, > > >> > > >> can you still reproduce with the latest APR 1.5.x, notably > containing this fix: http://svn.apache.org/r1605769. > > >> I don't think there is a released version with this patch... > > >> > > >> Regards, > > >> Yann. > > >> > > >> On Fri, Jul 18, 2014 at 9:38 PM, Steve Zweep < > Steve.Zweep@watchguard.com > > > wrote: > > >>> I've attached annotated logs that show the issues I > described. Both scenarios have ProxyWebsocketAsync > > turned on. The first does not use the AsyncDelay and shows how > server messages stall and are not delivered until > > the client polls. The second has ProxyWebsocketAsyncDelay set t= o > 100. In that case, message processing works > > properly, but threads are held open and there is no sign of > async processing. > > >>> > > >>> Since my build and execution environment were somewhat > non-standard, I repeated all the tests today on stock > > Ubuntu 14.04, with a fresh checkout and build of httpd and apr > trunk code. The same results were observed. > > >>> > > >>> - Steve > > >>> > > >>> > > >>> > > >>> -----Original Message----- > > >>> ________________________________________ > > >>> From: Eric Covener [covener@gmail.com covener@gmail.com>] > > >>> Sent: July 17, 2014 9:15 PM > > >>> To: Apache HTTP Server Development List > > >>> Subject: Re: Question about async mod_proxy_wstunnel and > threads > > >>> > > >>> I am having trouble seeing it mis-behave. w/ Async and > AsyncDelay, I > > >>> am seeing the expected trace messages and when I look at > backtraces > > >>> of httpd I can see zero threads in wstunnel . If I send a > server > > >>> msg, I get it ASAP in the client -- and then I see 1 thread > in poll > > >>> for the right couple of seconds > > >>> > > >>> Can you grab trace at e.g. > > >>> > > >>> LogLevel INFO proxy_wstunnel_module:trace8 > > >>> > > >>> And annotate the timing a bit for what you do in the client= ? > Is it possible you have an un-updated trunk > > from several weeks ago? There was an optimization put in and > backed out that might have broke some of these > > same things for a very short window. > > >>> > > > > > > > > -- > > Eric Covener > > covener@gmail.com > > > > > > > --047d7b471c2e23987904ff07f1a0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Re: Patch; yes, I think that works as well, but I'm st= ill not sure it makes sense. If two or more sockets are readable, the first= one that's processed will remove itself and the other ones from the po= llset; the rest will try to remove themselves from the pollset, but nothing= will happen (since they've already been removed by the processing of t= he first one).


On Fri,= Jul 25, 2014 at 11:50 AM, Ruediger Pluem <rpluem@apache.org> wrote:
Does the following patch work as well?

Index: event.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- event.c =C2=A0 =C2=A0 (revision 1613354)
+++ event.c =C2=A0 =C2=A0 (working copy)
@@ -1973,9 +1973,12 @@
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0apr_pollset_remove(event_pollset, baton->= pfds[i]);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0}

+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0apr_pollset_remove(event_pollset, out_pfd);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0push_timer2worker(te);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
- =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0apr_pollset_remove= (event_pollset, out_pfd);
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0else {
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0apr_= pollset_remove(event_pollset, out_pfd);
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0out_pfd++;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0num--;

Regards

R=C3=BCdiger

Edward Lu wrote:
> I think I've found the issue; there's an extraneous apr_pollse= t_remove() call in event.c:1978. Relevant code
> (double-slash comments are mine):
>
> =C2=A0 =C2=A0 /* We only signal once per N sockets with this baton */<= br> > =C2=A0 =C2=A0 if (!(baton->signaled)) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 baton->signaled =3D 1;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 te =3D event_get_timer_event(-1 /* fake ti= mer */,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0baton->cbfunc, > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0baton->user_baton= ,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A00, /* don't inse= rt it */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0NULL /* no associate= d socket callback */);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* remove other sockets in my set */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 // says remove "other" sockets, = but it really removes all
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 for (i =3D 0; i < baton->nsock; i++)= {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 rc =3D apr_pollset_remove(ev= ent_pollset, baton->pfds[i]);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 }
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 push_timer2worker(te);
> =C2=A0 =C2=A0 }
> =C2=A0 =C2=A0 // This causes the problem; if the sockets are processed= and re-added
> =C2=A0 =C2=A0 // =C2=A0before this is called, the backend socket gets = re-removed, never to
> =C2=A0 =C2=A0 // =C2=A0be seen again (until the client sends something= )
> =C2=A0 =C2=A0 apr_pollset_remove(event_pollset, out_pfd);
>
> Seems like the solution is just to remove the extra apr_pollset_remove= (); all sockets are removed from the pollset when
> one of them is signalled. If another one was signalled in the same pol= l, it will enter this block and simply do nothing
> (because of the baton->signaled check). I gave it a quick test and = it looked like the problem was solved.
>
>
> On Tue, Jul 22, 2014 at 10:46 AM, Edward Lu <= chaosed0@gmail.com <mailto:chaosed0@gmail.com>> wrote: >
> =C2=A0 =C2=A0 I got it to break with these python scripts, as well as = with my own test client and server (browser javascript and
> =C2=A0 =C2=A0 node.js). Strange though, I can't reproduce it with = only 2 clients on either of those programs; I have to get 3
> =C2=A0 =C2=A0 clients connected at once. Past that, I see exactly the = behavior you describe.
>
> =C2=A0 =C2=A0 I agree that it seems to be some sort of timing issue - = I've been trying to put trace statements into the HTTPD to
> =C2=A0 =C2=A0 see what information I can get, but once I do that it st= arts behaving as expected. Still investigating...
>
>
> =C2=A0 =C2=A0 On Tue, Jul 22, 2014 at 10:= 40 AM, Steve Zweep <Steve.= Zweep@watchguard.com <mailto:Steve.Zweep@watchguard.com>> wrote:
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Another tidbit of information ...
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 I attempted to see what was going on by ru= nning httpd in the foreground with strace. When I did that, the tests
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 worked perfectly. I guess this suggests it= is some sort of timing issue. Running httpd in the foreground without
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 trace exhibited the problem.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Here's the output of strace during the= test. I sent the string "hhhhhhh" from client 1 and it was echoe= d back by
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 the server to both clients, as expected. >
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] <... futex resumed> ) = =C2=A0 =C2=A0 =C2=A0 =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] futex(0x891ec54, FUTEX_WAKE_PR= IVATE, 1) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_wait(13, {{EPOLLIN, {u32= =3D144017752, u64=3D144017752}}}, 2, 0) =3D 1
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(12, "\201\10hhhhhhhh= ", 8000) =3D 10
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] mprotect(0xb6a3a000, 8192, PRO= T_READ|PROT_WRITE) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(12, 0xb6a39aa8, 8000) =C2= =A0=3D -1 EAGAIN (Resource temporarily unavailable)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] writev(11, [{"\201\10hhhh= hhhh", 10}], 1) =3D 10
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(12, 0xa96084b0, 8000) =C2= =A0=3D -1 EAGAIN (Resource temporarily unavailable)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_wait(13, {}, 2, 0) =C2= =A0 =C2=A0=3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 11= , {EPOLLIN, {u32=3D144044472, u64=3D144044472}}) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 12= , {EPOLLIN, {u32=3D144044496, u64=3D144044496}}) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] gettimeofday({1406038173, 7456= 44}, NULL) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_wait(10, {{EPOLLIN, {u32= =3D2843761856, u64=3D2843761856}}}, 2, 0) =3D 1
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(9, "\201\10hhhhhhhh&= quot;, 8000) =3D 10
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(9, 0xb6a39aa8, 8000) =C2= =A0 =3D -1 EAGAIN (Resource temporarily unavailable)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] writev(8, [{"\201\10hhhhh= hhh", 10}], 1) =3D 10
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] read(9, 0xa96084b0, 8000) =C2= =A0 =3D -1 EAGAIN (Resource temporarily unavailable)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_wait(10, {}, 2, 0) =C2= =A0 =C2=A0=3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 8,= {EPOLLIN, {u32=3D2843796920, u64=3D2843796920}}) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] epoll_ctl(7, EPOLL_CTL_ADD, 9,= {EPOLLIN, {u32=3D2843796944, u64=3D2843796944}}) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] gettimeofday({1406038173, 7458= 32}, NULL) =3D 0
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 [pid 15451] futex(0x891ec88, FUTEX_WAIT_PR= IVATE, 38, NULL <unfinished ...>
>
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 - Steve
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 -----Original Message-----
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 From: Steve Zweep
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Sent: Monday, July 21, 2014 11:56 AM
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 To: 'dev@httpd.apache.org <mailto:dev@httpd.apache.org>'
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Subject: RE: Question about async mod_prox= y_wstunnel and threads
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 My client and server are based on sample c= ode included with the ws4py package
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 (http://ws4py.readthedocs.org/en/latest/). = We use CherryPy and the ws4py package contains a CherryPy-based
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 server (attached).
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 The client is a variation of the echo_clie= nt.py sample code also included in the ws4py package.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 In terms of failure rate, I'd say it i= s about 90%. Often I start 2 clients and initially it works OK, but then > =C2=A0 =C2=A0 =C2=A0 =C2=A0 stops after a message or two. Once it has = failed, I can send any number of messages from the one client and
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 continue to see the server echo them back,= but no message is received by the second. When I then send a message
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 from the second client I see a flood of me= ssages from the server that had been queued.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 - Steve
>
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 -----Original Message-----
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 From: Eric Covener [= mailto:covener@gmail.com <mailt= o:covener@gmail.com>]
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Sent: Saturday, July 19, 2014 8:16 AM
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 To: Apache HTTP Server Development List > =C2=A0 =C2=A0 =C2=A0 =C2=A0 Subject: Re: Question about async mod_prox= y_wstunnel and threads
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 I guess it is kind of telling that we coul= d still respond to the write from client 2, there are probably more
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 potential errors that make us lose track o= f both sides simultaneously.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Can you share your ws client and server if= they're easy to use? My little sample doesn't spray that way.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 I'm also curious approximately how oft= en it fails this way.
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 On Fri, Jul 18, 2014= at 5:50 PM, Yann Ylavic <ylavic= .dev@gmail.com <mailto:ylavi= c.dev@gmail.com>> wrote:
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 > Probably a false trail, trunk at r161= 1741 is up to date, I think 1.5.x
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 > won't help.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 > On Fri, Jul 18, 2014 at 11:35 PM, Ste= ve Zweep
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> From: Yann = Ylavic [mailto:ylavic.dev@gmail.com= <mailto:ylavic.dev@gmail.co= m>]
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> Sent: Friday, July 18, 2014 4:51 = PM
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> To: httpd
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> Subject: Re: Question about async= mod_proxy_wstunnel and threads
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> Hi Steve,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> can you still reproduce with the = latest APR 1.5.x, notably containing this fix: http://svn.apache.org/r1605769.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> I don't think there is a rele= ased version with this patch...
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> Regards,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> Yann.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >> On Fri, Jul 18, 2014 at 9:3= 8 PM, Steve Zweep <Steve.Z= weep@watchguard.com <mailto:Steve.Zweep@watchguard.com>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 wrote:
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> I've attached annotated l= ogs that show the issues I described. Both scenarios have ProxyWebsocketAsy= nc
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 turned on. The first does not use the Asyn= cDelay and shows how server messages stall and are not delivered until
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 the client polls. The second has ProxyWebs= ocketAsyncDelay set to 100. In that case, message processing works
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 properly, but threads are held open and th= ere is no sign of async processing.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> Since my build and execution = environment were somewhat non-standard, I repeated all the tests today on s= tock
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Ubuntu 14.04, with a fresh checkout and bu= ild of httpd and apr trunk code. The same results were observed.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> - Steve
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> -----Original Message----- > =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> _____________________________= ___________
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> From: Eric Covener [covener@gmail.com <mailto:covener@gmail.com>]
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> Sent: July 17= , 2014 9:15 PM
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> To: Apache HTTP Server Develo= pment List
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> Subject: Re: Question about a= sync mod_proxy_wstunnel and threads
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> I am having trouble seeing it= mis-behave. w/ Async and AsyncDelay, I
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> am seeing the expected trace = messages and when I look at backtraces
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> of httpd I can see zero threa= ds in wstunnel . If I send a server
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> msg, I get it ASAP in the cli= ent -- and then I see 1 thread in poll
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> for the right couple of secon= ds
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> Can you grab trace at e.g. > =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> LogLevel INFO proxy_wstunnel_= module:trace8
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>> And annotate the timing a bit= for what you do in the client? =C2=A0Is it possible you have an un-updated= trunk
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 from several weeks ago? =C2=A0There was an= optimization put in and backed out that might have broke some of these
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 same things for a very short window.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 >>>
>
>
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 --
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 Eric Covener
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 covener@gmail.com <mailto:cove= ner@gmail.com>
>
>
>

--047d7b471c2e23987904ff07f1a0--