hc-dev mailing list archives

From Hiranya Jayathilaka <hiranya...@gmail.com>
Subject Re: HttpCore NIO hurt by JDK bug?
Date Mon, 01 Nov 2010 19:32:25 GMT
On Tue, Nov 2, 2010 at 12:48 AM, swatkatz <mohanrao.01@gmail.com> wrote:
>
> Hello,
>
> We seem to be experiencing this as well when using NIO. We are using JDK 1.6
> Update 21.

This bug should be fixed in JDK 1.6 build 21. At least, that's what all
the evidence suggests: we have never been able to reproduce the issue on
this particular JDK version.

Thanks,
Hiranya

> Any ideas what the workaround/fix is?
>
> Regards,
> Mohan
>
>
>
> olegk wrote:
>>
>> On Thu, 2010-07-15 at 12:50 -0700, Harold Lee wrote:
>>> I've put together a simple HTTP server that resets the connection
>>> after sending part of the response back to the client. I'm going to
>>> try to recreate the bug (leaking sockets) by making many requests
>>> against that server from a Linux box. I'll let you know what I find.
>>>
>>> Harold
>>>
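
A test server of the kind described above could look roughly like the
sketch below. It is only an illustration, not Harold's actual code: the
class name, the declared length and the single-connection design are all
assumptions. It relies on SO_LINGER with a zero timeout, which on typical
TCP stacks makes close() abort the connection with a reset instead of a
normal shutdown.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical test helper: sends part of a response, then aborts the connection. */
final class ResettingServer {

    /** Body length announced in Content-Length (illustrative value). */
    static final int DECLARED_LENGTH = 1000;

    /** Accepts one connection, writes half of the announced body, then resets. */
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket socket = server.accept()) {
            OutputStream out = socket.getOutputStream();
            String head = "HTTP/1.1 200 OK\r\n"
                    + "Content-Length: " + DECLARED_LENGTH + "\r\n\r\n";
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            // Deliberately send only half of the announced body.
            out.write(new byte[DECLARED_LENGTH / 2]);
            out.flush();
            // Linger enabled with timeout 0: the close() at the end of this
            // try block discards unsent data and typically sends a TCP RST.
            socket.setSoLinger(true, 0);
        }
    }
}
```

Running serveOnce in a loop and pointing many clients at it from a Linux
box would approximate the scenario in the message. Whether that is enough
to trigger the descriptor leak is not something this sketch can confirm.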
>>
>>
>>
>>> On Wed, Jul 14, 2010 at 1:44 AM, Oleg Kalnichevski <olegk@apache.org>
>>> wrote:
>>> > On Tue, 2010-07-13 at 13:32 -0700, Harold Lee wrote:
>>> >> Regarding this JDK bug:
>>> >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
>>> >>
>>> >> I think we are experiencing this using HttpCore on Linux with Java
>>> >> 1.6. We wind up leaking socket descriptors until the JVM process runs
>>> >> out. We also wind up having to start a new reactor thread, which
>>> >> creates a new Selector. The old reactor thread keeps running and the
>>> >> thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait as
>>> >> reported by others in the bug report above.
>>> >>
>>> >
>>
>>
>> Hi Harold
>>
>> Did you have any luck reproducing the problem?
>>
>> I put together a work-around for the bug that causes the epoll spin
>> problem [1]. If you are interested in trying it out I will happily share
>> it with you. The work-around is pretty ugly, so I want to be sure there
>> is no other way of solving the issue.
>>
>> cheers
>>
>> Oleg
>>
>> [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
>>
>>> > Folks
>>> >
>>> > Anyone experienced anything like that? The bug looks pretty old, but
>>> > there have been no reports of similar problems with HttpCore NIO. I am
>>> > using Linux / JDK 1.6 on a daily basis when hacking on HttpCore but I
>>> > have not encountered such a problem yet.
>>> >
>>> >
>>> >> Here's the change that the Glassfish team made to work around this
>>> >> JDK bug:
>>> >>
>>> >> http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9
>>> >>
>>> >> From my reading, the Glassfish code is much simpler than the HttpCore
>>> >> NIO code: they're registering interest for just 1 socket and using
>>> >> Selector.select() to wait for data from that socket. For HttpCore NIO,
>>> >> it isn't yet clear to me how we can detect which selector is "trashed"
>>> >> in order to cancel it and recreate it.
>>> >>
>>> >> I'm working on a workaround in AbstractMultiworkerIOReactor.java. If
>>> >> selector.select returns 0 (setting readyCount to 0) then we don't know
>>> >> whether this bug hit us or we just had a timeout.
>>> >
>>> > The problem is that it is perfectly valid for a selector to return 0
>>> > ready count. This condition alone is not sufficient to assume the
>>> > selector is trashed.
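
Since a zero ready count on its own proves nothing, one possible
heuristic is to also look at how long select() blocked and how often it
came back early in a row. The sketch below is an assumption-laden
illustration, not HttpCore code: the class name, the threshold and the
"less than half the timeout" rule are all made up for the example. A
caller would also have to discount early returns caused by wakeup() or
thread interruption, which this sketch does not do.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: flags a selector that keeps returning early with nothing ready. */
final class SpinDetector {

    private final int threshold;   // consecutive early returns tolerated (assumed knob)
    private int prematureReturns;

    SpinDetector(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select() call and returns true once the
     * number of consecutive premature zero-ready returns reaches the threshold.
     */
    boolean record(int readyCount, long timeoutMillis, long elapsedMillis) {
        // A genuine timeout blocks for roughly the full timeout; a spinning
        // selector returns 0 almost immediately. Half the timeout is an
        // arbitrary cut-off chosen for this sketch.
        boolean premature = readyCount == 0
                && timeoutMillis > 0
                && elapsedMillis < timeoutMillis / 2;
        prematureReturns = premature ? prematureReturns + 1 : 0;
        return prematureReturns >= threshold;
    }

    /** Runs one timed select() and feeds the result into record(). */
    boolean selectAndCheck(Selector selector, long timeoutMillis) throws IOException {
        long start = System.nanoTime();
        int readyCount = selector.select(timeoutMillis);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return record(readyCount, timeoutMillis, elapsedMillis);
    }
}
```

Any select() that reports ready keys or blocks for most of the timeout
resets the counter, so ordinary timeouts alone never trip the detector.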
>>> >
>>> >
>>> >>  To be safe, I think
>>> >> we need to close every registered SelectorKey and then call
>>> >> selector.selectNow() to flush them. Then we can create a new
>>> >> SelectorKey for each and reregister them. The only way to make it less
>>> >> common, I think, is to use a long selectTimeout value so that the odds
>>> >> of a timeout are low. Ugly, but I hope it will work.
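
For illustration, here is a rough sketch of the re-registration idea.
Note that it is a variant of what is described above: instead of
re-registering on the same selector after selectNow(), it moves every
valid key to a freshly opened Selector and closes the old one. The class
and method names are hypothetical, nothing here is taken from
AbstractMultiworkerIOReactor, and it does not deal with channels being
added or closed concurrently or with timeout bookkeeping. It assumes it
is called from the thread that owns the selector.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;

/** Hypothetical helper: migrates all registrations to a new Selector. */
final class SelectorRebuilder {

    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        // Iterate over a copy so the loop does not depend on the live key set.
        for (SelectionKey key : new ArrayList<>(old.keys())) {
            if (!key.isValid()) {
                continue; // channel already closed or key already cancelled
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();      // must be read before cancel()
            Object attachment = key.attachment();
            key.cancel();
            channel.register(fresh, ops, attachment);
        }
        old.close();
        return fresh;
    }
}
```

Code holding references to the old SelectionKey objects would have to be
pointed at the new keys, which is part of why a real fix is more involved
than this sketch suggests.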
>>> >>
>>> >
>>> > This will unfortunately screw up handling of new / closed channels as
>>> > well as timeout logic.
>>> >
>>> > The work-around looks butt ugly and would require tons of fairly
>>> > complex code. Is there a way to reproduce the issue with a test
>>> > scenario, so we could look for alternative approaches?
>>> >
>>> > Cheers
>>> >
>>> > Oleg
>>> >
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
>>> For additional commands, e-mail: dev-help@hc.apache.org
>>>
>>>
>>
>>
>>
>>
>>
>>
>
> --
> View this message in context: http://old.nabble.com/HttpCore-NIO-hurt-by-JDK-bug--tp29155405p30107703.html
> Sent from the HttpComponents-Dev mailing list archive at Nabble.com.
>
>
>
>



-- 
Hiranya Jayathilaka
Senior Software Engineer;
WSO2 Inc.;  http://wso2.org
E-mail: hiranya@wso2.com;  Mobile: +94 77 633 3491
Blog: http://techfeast-hiranya.blogspot.com
