From: Mark Thomas
To: Tomcat Developers List
Date: Mon, 30 Sep 2013 14:34:26 +0100
Subject: Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]
Message-ID: <52497DE2.4010007@apache.org>
In-Reply-To: <52492924.1040205@apache.org>

On 30/09/2013 08:32, Mark Thomas wrote:
> On 28/09/2013 11:57, Mladen Turk wrote:
>> On 09/28/2013 12:25 PM, Rainer Jung wrote:
>>>
>>>> I can't seem to reproduce the problem using an APR connector on Linux:
>>>> reports seem to indicate that trivially accessing Tomcat
>>>> via an HTTP
>>>> APR connector will cause a crash, and I was able to run the following
>>>> command without bringing down my instance:
>>>>
>>>> $ ab -n 10000 -c 50 http://localhost:28215/my-webapp/index.html
>>>
>>>
>>> Still, the whole poll stuff is platform dependent, so it might be that
>>> you will not be able to reproduce the Windows crash even when the test
>>> scenario is exactly the same as there.
>>>
>>
>> Almost every APR crash error is related to reusing a closed
>> object/descriptor/pointer.
>> E.g., crashing inside poll can be caused by closing the socket that is
>> inside the poller.
>> The fact that it doesn't crash on Linux might be just because of "close"
>> order.
>> Even on the same OS it doesn't have to crash all the time.
>>
>> Anyhow, I'll try to check the 8-RC3 with the latest native on Windows.
>> It seems there is some conceptual problem inside TC8, probably with
>> async sockets and double close.
>
> I suspect multiple factors:
> - the refactoring to a single poller thread may have introduced some
>   additional issues we haven't tracked down yet
> - allowing two threads (one read, one write) to work with the same
>   socket introduces lots of opportunities to close a socket in one thread
>   while it is still in use in another
>
> I've been mulling over refactoring the connector code so all actions on
> the socket go via the socket wrapper so it is easier to track true
> socket state in one place. Now might be the time to implement that change.

I believe I have found the root cause of the crash. It is the code I
added to remove a socket from the poller when it was being closed. Since
removing a socket from the poller is not thread-safe, I modified the
code so that the poller thread did this.

The problem this introduced is that the socket may well be closed before
it is removed from the poller, and there is a strong possibility that
poll() will be called on the closed socket. Either of these happening
will trigger a crash.
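To illustrate the ordering described above, here is a minimal Java sketch. It is not the Tomcat or tcnative code: FakeSocket, SketchPoller, requestClose() and runOnce() are hypothetical names invented for this example, and the "poll" is simulated. The assumption is that other threads only request a close, and the poller thread itself removes the socket from the poll set first and closes it second, so that poll() never sees a closed socket.

```java
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a native socket; not a Tomcat or APR class. */
final class FakeSocket {
    final long id;
    private volatile boolean closed;

    FakeSocket(long id) { this.id = id; }
    boolean isClosed() { return closed; }
    void close() { closed = true; }
}

/** Hypothetical poller: only the poller thread touches the poll set. */
final class SketchPoller {
    private final Set<FakeSocket> pollSet = new HashSet<>();
    private final Queue<FakeSocket> addQueue = new ConcurrentLinkedQueue<>();
    private final Queue<FakeSocket> closeQueue = new ConcurrentLinkedQueue<>();

    /** Any thread: ask the poller to start polling this socket. */
    void add(FakeSocket s) { addQueue.add(s); }

    /** Any thread: request a close instead of closing the socket directly. */
    void requestClose(FakeSocket s) { closeQueue.add(s); }

    /**
     * One iteration of the poller loop; call from the poller thread only.
     * Returns the number of sockets that were (pretend) polled.
     */
    int runOnce() {
        FakeSocket s;
        while ((s = addQueue.poll()) != null) {
            if (!s.isClosed()) {
                pollSet.add(s);
            }
        }
        // Remove from the poll set first, close second, both on this thread.
        while ((s = closeQueue.poll()) != null) {
            pollSet.remove(s);
            s.close();
        }
        // Simulated poll(): a closed socket here models the crash scenario.
        int polled = 0;
        for (FakeSocket open : pollSet) {
            if (open.isClosed()) {
                throw new IllegalStateException("poll() on closed socket " + open.id);
            }
            polled++;
        }
        return polled;
    }
}

final class PollerCloseDemo {
    public static void main(String[] args) {
        SketchPoller poller = new SketchPoller();
        FakeSocket a = new FakeSocket(1);
        FakeSocket b = new FakeSocket(2);
        poller.add(a);
        poller.add(b);
        System.out.println("polled: " + poller.runOnce());
        poller.requestClose(a); // a stays open until the poller processes it
        System.out.println("polled: " + poller.runOnce());
        System.out.println("a closed: " + a.isClosed());
    }
}
```

The point of the sketch is only the order inside runOnce(): if a worker thread called close() directly, the simulated poll could find a closed socket still in the set, which is the situation that crashes in native code.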

I'm currently looking at how best to refactor the socket close code for
APR. In addition to fixing the bug, I'd like to make the intended
behaviour of the code clearer for the next person who needs to make
changes.

I'd like to try to make the changes as a series of small, incremental,
easy-to-review changes. I'm not sure how feasible that will be.

I hope to have this fixed in the next day or so.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org