hc-dev mailing list archives

From James Abley <james.ab...@gmail.com>
Subject Re: HttpCore NIO hurt by JDK bug?
Date Wed, 04 Aug 2010 12:32:18 GMT
Is that the original 1.6.0 release? I thought all later releases had a
version number of the form 1.6.0_xx, where you could tie the 'xx' to a minor
update listed here [1].

Cheers,

James

[1] http://www.oracle.com/technetwork/java/javase/releasenotes-136954.html

On 3 August 2010 06:00, Supun Kamburugamuva <supun06@gmail.com> wrote:

> Hi Harold,
>
> We are having the same problem and we are interested to know how you get
> this exact Java version. Is there a link to this particular Java version?
>
> Thanks,
> Supun..
>
> On Tue, Aug 3, 2010 at 1:57 AM, Harold Lee <harold@hotelling.net> wrote:
>
> > This seems to be fixed by a newer version of the JRE, i.e.
> >
> > java version "1.6.0"
> > Java(TM) SE Runtime Environment (build 1.6.0-b105)
> > Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-b105, mixed mode)
> >
> > So I think that you can avoid the tricky workaround. Thank you for
> > your time and attention.
> >
> > Harold
> >
> > On Sat, Jul 31, 2010 at 7:11 AM, Oleg Kalnichevski <olegk@apache.org>
> > wrote:
> > > On Thu, 2010-07-15 at 12:50 -0700, Harold Lee wrote:
> > >> I've put together a simple HTTP server that resets the connection
> > >> after sending part of the response back to the client. I'm going to
> > >> try to recreate the bug (leaking sockets) by making many requests
> > >> against that server from a Linux box. I'll let you know what I find.
> > >>
> > >> Harold
> > >>
> > >
> > >
> > >
> > >> On Wed, Jul 14, 2010 at 1:44 AM, Oleg Kalnichevski <olegk@apache.org> wrote:
> > >> > On Tue, 2010-07-13 at 13:32 -0700, Harold Lee wrote:
> > >> >> Regarding this JDK bug:
> > >> >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
> > >> >>
> > >> >> I think we are experiencing this using HttpCore on Linux with Java
> > >> >> 1.6. We wind up leaking socket descriptors until the JVM process runs
> > >> >> out. We also wind up having to start a new reactor thread, which
> > >> >> creates a new Selector. The old reactor thread keeps running and the
> > >> >> thread dump shows it in sun.nio.ch.EPollArrayWrapper.epollWait, as
> > >> >> reported by others in the bug report above.
> > >> >>
> > >> >
> > >
> > >
> > > Hi Harold
> > >
> > > Did you have any luck reproducing the problem?
> > >
> > > I put together a work-around for the bug that causes the epoll spin
> > > problem [1]. If you are interested in trying it out I will happily share
> > > it with you. The work-around is pretty ugly, so I want to be sure there
> > > is no other way of solving the issue.
> > >
> > > cheers
> > >
> > > Oleg
> > >
> > > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
> > >
> > >> > Folks
> > >> >
> > >> > Anyone experienced anything like that? The bug looks pretty old, but there
> > >> > have been no reports of similar problems with HttpCore NIO. I am using
> > >> > Linux / JDK 1.6 on a daily basis when hacking on HttpCore but I have not
> > >> > encountered such a problem yet.
> > >> >
> > >> >
> > >> >> Here's the change that the Glassfish team made to work around this
> > >> >> JDK bug:
> > >> >>
> > >> >> http://fisheye5.cenqua.com/browse/glassfish/appserv-http-engine/src/java/com/sun/enterprise/web/connector/grizzly/ByteBufferInputStream.java?r1=1.8&r2=1.9
> > >> >>
> > >> >> From my reading, the Glassfish code is much simpler than the HttpCore
> > >> >> NIO code: they're registering interest for just 1 socket and using
> > >> >> Selector.select() to wait for data from that socket. For HttpCore NIO,
> > >> >> it isn't yet clear to me how we can detect which selector is "trashed"
> > >> >> in order to cancel it and recreate it.
> > >> >>
> > >> >> I'm working on a workaround in AbstractMultiworkerIOReactor.java. If
> > >> >> selector.select returns 0 (setting readyCount to 0) then we don't know
> > >> >> whether this bug hit us or we just had a timeout.
> > >> >
> > >> > The problem is that it is perfectly valid for a selector to return 0
> > >> > ready count. This condition alone is not sufficient to assume the
> > >> > selector is trashed.
> > >> >
> > >> >
> > >> >> To be safe, I think
> > >> >> we need to cancel every registered SelectionKey and then call
> > >> >> selector.selectNow() to flush them. Then we can create a new
> > >> >> SelectionKey for each and reregister them. The only way to make it less
> > >> >> common, I think, is to use a long selectTimeout value so that the odds
> > >> >> of a timeout are low. Ugly, but I hope it will work.
> > >> >>
> > >> >
> > >> > This will unfortunately screw up handling of new / closed channels as
> > >> > well as timeout logic.
> > >> >
> > >> > The work-around looks butt ugly and would require tons of fairly complex
> > >> > code. Is there a way to reproduce the issue with a test scenario, so we
> > >> > could look for alternative approaches?
> > >> >
> > >> > Cheers
> > >> >
> > >> > Oleg
> > >> >
> > >> >
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
> > >> For additional commands, e-mail: dev-help@hc.apache.org
> > >>
> > >>
> > >
> > >
> > >
> >
> >
> >
>
>
> --
> Tech Lead, WSO2 Inc
> http://wso2.org
> supunk.blogspot.com
>
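
For readers following the quoted discussion: Oleg's objection is that a
ready count of 0 is normal after a timeout, so it cannot identify a trashed
selector on its own. One heuristic that is sometimes used is to count
consecutive select() calls that return 0 well before the requested timeout
has elapsed. The sketch below illustrates that idea only. It is not the
work-around Oleg mentions and not HttpCore code; the class name
SpinDetector, the "less than half the timeout" rule, and the threshold are
all assumptions made for illustration. Note that Selector.wakeup() and
thread interrupts also produce early zero returns, so a real reactor would
have to exclude those cases.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of HttpCore): flags a selector as suspect
 * after several consecutive select() calls return 0 keys much earlier than
 * the requested timeout.
 */
final class SpinDetector {
    private final long timeoutNanos;
    private final int threshold;
    private int prematureReturns;

    SpinDetector(long timeoutMillis, int threshold) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select(timeout) call.
     *
     * @param readyCount   value returned by select()
     * @param elapsedNanos time spent inside select(), measured by the caller
     * @return true once 'threshold' consecutive calls returned 0 keys in
     *         less than half the requested timeout
     */
    boolean record(int readyCount, long elapsedNanos) {
        if (readyCount == 0 && elapsedNanos < timeoutNanos / 2) {
            prematureReturns++;
        } else {
            // A genuine timeout or real I/O activity ends the streak.
            prematureReturns = 0;
        }
        return prematureReturns >= threshold;
    }
}
```

A reactor loop would take System.nanoTime() before and after
selector.select(timeout), pass the difference to record(), and treat a true
result as a hint (not proof) that the selector is spinning.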

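Harold's proposal in the quoted text (cancel every registered key, then
re-register each channel) can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not the change made in
AbstractMultiworkerIOReactor: the class name SelectorRebuilder is
hypothetical, the code assumes it runs on the only thread that touches the
selector, and it does nothing about the session timeout bookkeeping that
Oleg points out would also be affected.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of HttpCore). */
final class SelectorRebuilder {
    private SelectorRebuilder() {}

    /**
     * Moves every valid registration from 'old' to a freshly opened
     * Selector, then closes 'old'. Channels themselves stay open.
     * Assumes no other thread registers or cancels keys concurrently.
     */
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        // Work on a snapshot of the key set.
        List<SelectionKey> keys = new ArrayList<>(old.keys());
        for (SelectionKey key : keys) {
            if (!key.isValid()) {
                continue; // already cancelled or channel closed
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            // Same interest set and attachment, new selector.
            channel.register(fresh, ops, attachment);
        }
        // Closing the old selector deregisters its remaining keys.
        old.close();
        return fresh;
    }
}
```

The caller would replace its selector reference with the returned one
before the next select() call.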