hc-dev mailing list archives

From "Michael Clark (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HTTPCORE-155) Performance issues with IBM JRE 6.0
Date Thu, 25 Dec 2008 00:06:44 GMT

    [ https://issues.apache.org/jira/browse/HTTPCORE-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659172#action_12659172 ]

mclark edited comment on HTTPCORE-155 at 12/24/08 4:05 PM:
------------------------------------------------------------------

Sam,

Thanks for that explanation; it helped me a lot.  I agree that a change in the VM implementation
would be the cleanest resolution to this particular problem, ideally combined with a tightening
of the SelectionKey contract to remove some of the "implementation dependent" ambiguity
that the contract allows for.  Unfortunately, we are probably stuck with that contract for
a while, as it has been there through Java 1.4, 5 and now 6.

A couple things that your message clarified for me:

1) I had not seen the doc comment about naive vs. high-performance implementations.
1.1) And so I did not realize that the length of time the #interestOps(int) method blocks
can be reduced by ensuring that #interestOps(int) is invoked only when no thread is in
#select().  Well, at least the docs hint fairly strongly at this.

2) It explains to me why Marc's patch relieves the symptoms of this particular problem: by
making sure that #interestOps(int) is invoked only by the thread that is #select()ing, he
can serialize the calls to #interestOps(int) and #select(), which ensures that their invocation
is mutually exclusive.
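
To check that I have the idea right, here is a rough sketch of that kind of
serialization.  It is not Marc's patch and not HttpCore code: the class
InterestOpsQueue and its methods request() and applyPending() are names I made
up for illustration.  Other threads only enqueue a request; the selecting thread
is the only one that ever calls #interestOps(int), and it does so between
#select() calls.

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper for illustration only; not part of HttpCore.
final class InterestOpsQueue {

    // One pending request: set the interest set of `key` to `ops`.
    private static final class Entry {
        final SelectionKey key;
        final int ops;

        Entry(SelectionKey key, int ops) {
            this.key = key;
            this.ops = ops;
        }
    }

    private final Queue<Entry> pending = new ConcurrentLinkedQueue<>();

    // May be called from any thread. It only enqueues and does not touch the key.
    void request(SelectionKey key, int ops) {
        pending.add(new Entry(key, ops));
    }

    // Must be called only by the selecting thread, between two select() calls.
    // Returns the number of requests that were actually applied.
    int applyPending() {
        int applied = 0;
        Entry e;
        while ((e = pending.poll()) != null) {
            if (e.key.isValid()) { // skip keys cancelled in the meantime
                e.key.interestOps(e.ops);
                applied++;
            }
        }
        return applied;
    }

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.sink().configureBlocking(false);
        // Register with an empty interest set; a Pipe sink stands in for a socket.
        SelectionKey key = pipe.sink().register(selector, 0);

        InterestOpsQueue queue = new InterestOpsQueue();
        queue.request(key, SelectionKey.OP_WRITE);
        System.out.println("interestOps before applyPending: " + key.interestOps());
        queue.applyPending();
        System.out.println("interestOps after applyPending:  " + key.interestOps());

        selector.close();
        pipe.sink().close();
        pipe.source().close();
    }
}
```

The catch, as discussed further down, is that nothing here wakes the selecting
thread, so a request can sit in the queue for as long as #select() stays blocked.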

And so I think Marc's patch is getting close to the workaround you outlined, Sam?  Now that
I understand the problem more thoroughly (hopefully), I can perhaps offer some further review
on his changes:

As I mentioned before, Marc's patch is ignoring the selectTimeout member var of AbstractIOReactor
-- this will need to be addressed somehow (probably by reintroducing its use).  But why did
he make that change?  The only reason I can see for a selectTimeout as aggressive as 1ms
is the need to have the reactor wake up, retrieve, and process pending interestOps queue items,
even when the reactor has no reason to wake up out of #select().

Under Marc's changes, requests to change interestOps will not actually take effect until the
reactor wakes up for some reason and processes the interestOps queue.  Since enqueueing an
interestOps request does not wake the reactor, it is possible for very many interestOps requests
to pile up and not be processed in a timely fashion -- that is, if the reactor is asleep in
#select(long), with no reason to wake up.  So, Marc has set the select timeout very low, which
guarantees that the reactor will give the interestOps queue a great deal of attention, which
helps get those interestOps requests processed in a timely manner.  However, all this spinning
in the reactor execution loop is inefficient.  Instead of spinning/polling in the reactor,
would it work to invoke Selector#wakeup immediately after enqueueing an interestOps queue
item?  Would this let the reactor wake up in a timely fashion whenever there are interestOps
items in the queue, but stay asleep if there are none?
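
A minimal sketch of what I mean, under the assumption that pending changes can
be modelled as plain Runnables.  WakeupOnEnqueue, enqueue(), shutdown() and
runLoop() are hypothetical names, not anything in AbstractIOReactor.  The only
NIO behaviour it relies on is documented for Selector#wakeup: it makes a blocked
#select() return, and if no selection is in progress, the next #select() returns
immediately.  Because of that, enqueueing first and waking second should not
lose a request, and the select timeout can stay long.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical reactor skeleton for illustration only; not HttpCore code.
final class WakeupOnEnqueue {

    private final Selector selector;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    WakeupOnEnqueue(Selector selector) {
        this.selector = selector;
    }

    // May be called from any thread: enqueue first, then wake the selector.
    void enqueue(Runnable change) {
        pending.add(change);
        selector.wakeup();
    }

    // Asks the loop to stop and wakes it so that it notices the flag.
    void shutdown() {
        running = false;
        selector.wakeup();
    }

    // Runs on the selecting thread only. The timeout can be long because
    // enqueue() wakes the selector whenever there is queued work.
    void runLoop(long selectTimeoutMillis) throws IOException {
        while (running) {
            selector.select(selectTimeoutMillis);
            Runnable change;
            while ((change = pending.poll()) != null) {
                change.run(); // e.g. key.interestOps(ops) in a real reactor
            }
            // A real reactor would process selector.selectedKeys() here.
        }
    }

    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        WakeupOnEnqueue reactor = new WakeupOnEnqueue(selector);
        Thread dispatcher = new Thread(() -> {
            try {
                reactor.runLoop(60_000L); // long timeout, no 1ms polling
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }, "I/O dispatcher (sketch)");
        dispatcher.start();

        CountDownLatch applied = new CountDownLatch(1);
        reactor.enqueue(applied::countDown);
        boolean done = applied.await(5, TimeUnit.SECONDS);
        System.out.println("change applied within 5s: " + done);

        reactor.shutdown();
        dispatcher.join();
        selector.close();
    }
}
```

Whether the extra wakeup() calls are cheap enough on the IBM JRE under this
load is something I have not measured.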

regards,

Mike

      was (Author: mclark):
    Sam,

Thanks for that explanation, it helped me a lot.  I agree that a change in the VM implementation
would be the cleanest resolution to this particular problem.   That, combined with a tightening
up of the SelectionKey contract to remove some of the "implementation dependent" ambiguity
that the contract allows for.  Unfortunately, we are probably stuck with that contract for
a while, as it's been there for Java 1.4, 5 and now 6.

A couple things that your message clarified for me:

1) I had not seen the doc comment about naive vs. high-performance implementations
1.1) And so, I did not realize that a reduction in the length of time that the #interestOps(int)
method blocks, can be achieved by ensuring that #interestOps(int) is invoked only when no
thread(s) are in #select().  Well, at least the docs hint fairly strongly at this.

2) It explains to me why Marc's patch relieves the symptoms of this particular problem: by
making sure that #interestOps(int) is invoked only by the thread that is #select()ing, he
can serialize the calls to #interestOps(int) and #select(), which ensures that their invocation
is mutually exclusive.

And so I think Marc's patch is getting close to the workaround you outlined, Sam?  Now that
I understand the problem more thoroughly (hopefully), I can perhaps offer some further review
on his changes:

As I mentioned before, Marc's patch is ignoring the selectTimeout member var of AbstractIOReactor
-- this will need to be addressed somehow (probably by reintroducing its use.)  But why did
he make that change?  The only reason I can see for such an aggressive selectTimeout as 1ms,
is the need to have the reactor wake up, retrieve, and process pending interestOps queue items,
even when the reactor has no reason to wake up out #select.  

Requests to change interestOps will not actually take effect until the reactor wakes up for
some reason and processes the interestOps queue.  Since enqueueing interestOps requests does
not wake the reactor, it is possible for very many interestOps requests pile up and not be
processed in a timely fashion -- if the reactor is asleep in #select(long), with no reason
to wake up.  So, Marc has set the select timeout very low, which guarantees that the reactor
will give the selectionOps queue a great deal of attention, which helps get those interestOps
requests processed in a timely manner.  However, all this spinning in the reactor execution
loop is inefficient.  Instead of spinning/polling in the reactor, would it work to invoke
Selector#wakeup immediately after enqueueing an interestOps queue item?  That way the reactor
could wake up in a timely fashion whenever there are interestOps items in queue, but be able
stay asleep if there are none.

regards,

Mike
  
> Performance issues with IBM JRE 6.0
> -----------------------------------
>
>                 Key: HTTPCORE-155
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-155
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore NIO
>    Affects Versions: 4.0-beta1
>         Environment: Windows 2003 SP2 - IBM J2RE 1.6.0 build 2.4 - HTTPCore Beta1 - Dual Core CPU 3.0Ghz - 1Gbps networking
>            Reporter: Tom McSorley
>             Fix For: 4.1
>
>         Attachments: AbstractIOReactor.diff, AbstractIOReactor.java, IOSessionImpl.diff, IOSessionImpl.java, javacore.20081203.153723.32300.0001.txt, patch.08-12-17.tar.gz, patch.08-12-18.tar.gz, patch.08-12-22.tar.gz
>
>
> I'm issuing a second HTTP Request on a connection that has very recently returned a null
> for the submitRequest() call...  this 2nd request is being issued approximately 500ms after
> the submitRequest() null is returned... so the connection has just been established, an HTTP
> Request/Response-200 cycle has completed just prior to this 2nd request being issued.  I'm
> seeing unusually long delays in the requestOutput() call (verified by surrounding timing prints)...
> that can range anywhere from a few milliseconds on up to 60 seconds...  It eventually unwinds,
> and then the submitRequest() is called... this 2nd request is dispatched and works fine...
> but, it is delayed considerably...  Is this a known issue and is there a possible work-around?
> Here's the JVM related thread information:
> The thread being delayed and stuck in the requestOutput() call for a long time (mostly longer than 5 seconds):
> 3XMTHREADINFO      "pool-2-thread-5" TID:0x2AEECE00, j9thread_t:0x2A7189A8, state:B, prio=5
> 3XMTHREADINFO1            (native thread ID:0x1B44, native priority:0x5, native policy:UNKNOWN)
> 4XESTACKTRACE          at sun/nio/ch/SelectionKeyImpl.interestOps(SelectionKeyImpl.java:60)
> 4XESTACKTRACE          at org/apache/http/impl/nio/reactor/IOSessionImpl.setEvent(IOSessionImpl.java:113)
> 4XESTACKTRACE          at org/apache/http/impl/nio/NHttpConnectionBase.requestOutput(NHttpConnectionBase.java:158)
> .... (non important stack information removed)
> 4XESTACKTRACE          at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> 4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
> Here's the monitor that this thread is blocked and waiting on:
> 2LKMONINUSE      sys_mon_t:0x2A708AF8 infl_mon_t: 0x2A708B30:
> 3LKMONOBJECT       sun/nio/ch/Util$1@00B09208/00B09214: Flat locked by "I/O dispatcher 7" (0x2A208E00), entry count 1
> 3LKWAITERQ            Waiting to enter:
> 3LKWAITER                "pool-2-thread-5" (0x2AEECE00)
> And here's the thread that currently has this monitor locked:
> 3XMTHREADINFO      "I/O dispatcher 7" TID:0x2A208E00, j9thread_t:0x2A6EC73C, state:R, prio=5
> 3XMTHREADINFO1            (native thread ID:0x830, native priority:0x5, native policy:UNKNOWN)
> 4XESTACKTRACE          at sun/nio/ch/WindowsSelectorImpl$SubSelector.poll0(Native Method)
> 4XESTACKTRACE          at sun/nio/ch/WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:308(Compiled Code))
> 4XESTACKTRACE          at sun/nio/ch/WindowsSelectorImpl$SubSelector.access$500(WindowsSelectorImpl.java(Compiled Code))
> 4XESTACKTRACE          at sun/nio/ch/WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:162(Compiled Code))
> 4XESTACKTRACE          at sun/nio/ch/SelectorImpl.lockAndDoSelect(SelectorImpl.java:69(Compiled Code))
> 4XESTACKTRACE          at sun/nio/ch/SelectorImpl.select(SelectorImpl.java:80(Compiled Code))
> 4XESTACKTRACE          at org/apache/http/impl/nio/reactor/AbstractIOReactor.execute(AbstractIOReactor.java:121)
> 4XESTACKTRACE          at org/apache/http/impl/nio/reactor/BaseIOReactor.execute(BaseIOReactor.java:70)
> 4XESTACKTRACE          at org/apache/http/impl/nio/reactor/AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:318)
> 4XESTACKTRACE          at java/lang/Thread.run(Thread.java:735)
> I should also note that we're attempting to use 1000 client instances on this single
> system... each with potentially 2 active connections simultaneously... there is also virtually
> no CPU load (i.e. less than 5%) on this system...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org

