hc-httpclient-users mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: Blocked threads during requests
Date Sun, 10 Jan 2010 11:35:21 GMT
sebb wrote:
> On 10/01/2010, Oleg Kalnichevski <olegk@apache.org> wrote:
>> sebb wrote:
>>
>>> On 09/01/2010, Ken Krugler <kkrugler_lists@transpac.com> wrote:
>>>
>>>> [In the interest of not hijacking Tony's discussion thread, I'm putting this
>>>> into a new email.]
>>>>
>>>>
>>>>
>>>>> Tony Poppleton wrote:
>>>>>
>>>>>
>>>>>> Hi,
>>>>>> Further to the previous mail, I have already implemented my own
>>>>>> AbstractHttpEntity to eliminate a byte[] copy.  And I have seen the NIO
>>>>>> implementations of HttpEntities, however they don't seem to copy using NIO
>>>>>> methods so they won't be any faster than the standard IO implementations.
>>>>>> Anyway, it seems I have to go a level deeper than this class to be able
>>>>>> to do the NIO copy.  Is this the right direction to be digging in?
>>>>
>>>>>> Thanks,
>>>>>> Tony
>>>>>>
>>>>>>
>>>>> Tony
>>>>>
>>>>> Contrary to a common misconception, NIO is significantly slower than the
>>>>> classic blocking I/O in terms of raw data throughput. Modern operating
>>>>> systems and JVMs have become pretty efficient at switching thread contexts.
>>>>> Connection multiplexing starts paying off only when the number of concurrent
>>>>> connections exceeds 2000 or direct data streaming from or to a file is used.
>>>>
>>>>  I agree that NIO is often incorrectly viewed as a panacea for all network
>>>> performance issues.
>>>>
>>>>  I did want to mention that there are some multi-threading performance
>>>> issues which potentially NIO would avoid, for those who are using HttpClient
>>>> with 100s of threads.
>>>>
>>>>  For example, during a Bixo crawl with 300 threads, I was doing regular
>>>> thread dumps and inspecting the results. A very high percentage
>>>> (typically > 1/3) were blocked while waiting to get access to the cookie
>>>> store. By default there's only one of these per HttpClient.
>>>>
>>>>  This one was fairly easy to work around, by creating a cookie store in the
>>>> local context for each request:
>>>>
>>>>            CookieStore cookieStore = new BasicCookieStore();
>>>>            localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
>>>>
>>>>  But I've run into a few other synchronized method/data bottlenecks, which
>>>> I'm still working through. For example, at irregular intervals the bulk of
>>>> my fetcher threads are blocked on getting the scheme registry, either:
>>>>
>>>>  "pool-1-thread-9478" prio=10 tid=0x8e9ec400 nid=0x1fb waiting for monitor entry [0x8ee2e000]
>>>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>>>        at org.apache.http.conn.scheme.SchemeRegistry.get(SchemeRegistry.java:106)
>>>>        - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
>>>>        at org.apache.http.client.protocol.RequestAddCookies.process(RequestAddCookies.java:154)
>>>>        at org.apache.http.protocol.BasicHttpProcessor.process(BasicHttpProcessor.java:251)
>>>>        at org.apache.http.protocol.HttpRequestExecutor.preProcess(HttpRequestExecutor.java:168)
>>>>        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
>>>>
>>>>  or
>>>>
>>>>  "pool-1-thread-9470" prio=10 tid=0x8e9e7c00 nid=0x1f1 waiting for monitor entry [0x8d986000]
>>>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>>>        at org.apache.http.conn.scheme.SchemeRegistry.getScheme(SchemeRegistry.java:71)
>>>>        - waiting to lock <0x93f2c0c8> (a org.apache.http.conn.scheme.SchemeRegistry)
>>>>        at org.apache.http.impl.conn.DefaultHttpRoutePlanner.determineRoute(DefaultHttpRoutePlanner.java:111)
>>>>        at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:619)
>>>>        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:319)
>>>>
>>>>  If anybody (well, OK, Oleg) has input on things I could be doing wrong to
>>>> trigger this type of behavior, and/or ways to avoid it, I'm all ears.
>>>>
>>> Looks like the code could use ConcurrentHashMap instead of LinkedHashMap?
>>> All the methods could then be unsynchronised.
>>>
>>> The only method which would be affected by the change in ordering is
>>> getSchemeNames(). The Javadoc for this is a bit unclear (to me) but
>>> the test case shows that insertion order is not important (so I'm not
>>> sure why LinkedHashMap was used originally).
>>>
>>>
>>  Good catch, Sebastian!
>>
>>  I am pretty certain ordering does not matter. I am no longer sure why
>> LinkedHashMap was chosen in the first place.
>>
>>  Would you have time for looking into HTTPCLIENT-903?
> 
> Yes.
> 
> I can update the code in trunk - or would you prefer to review patches first?
> 

Just hack away ;-)

cheers

Oleg
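
For readers following the thread, here is a rough sketch of the change discussed above: back the registry with a ConcurrentHashMap so that lookups no longer need a monitor. This is not the HttpClient source or the HTTPCLIENT-903 patch. The class SchemeRegistrySketch and its nested Scheme type are hypothetical stand-ins written against the JDK only, and lower-casing scheme names is an assumption made for the illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a scheme registry; NOT the HttpClient class.
final class SchemeRegistrySketch {

    // Hypothetical value type: a scheme name and its default port.
    static final class Scheme {
        final String name;
        final int defaultPort;

        Scheme(String name, int defaultPort) {
            // Assumption: names are normalised to lower case.
            this.name = name.toLowerCase(Locale.ROOT);
            this.defaultPort = defaultPort;
        }
    }

    // ConcurrentHashMap handles its own locking, so no method here is synchronized.
    private final ConcurrentMap<String, Scheme> schemes =
            new ConcurrentHashMap<String, Scheme>();

    // Registers a scheme; returns the one previously registered under that name, or null.
    Scheme register(Scheme scheme) {
        return schemes.put(scheme.name, scheme);
    }

    // Returns the scheme for the name, or null if none is registered.
    Scheme get(String name) {
        return schemes.get(name.toLowerCase(Locale.ROOT));
    }

    // Like get(), but throws if the scheme is not registered.
    Scheme getScheme(String name) {
        Scheme found = get(name);
        if (found == null) {
            throw new IllegalStateException("Scheme '" + name + "' not registered.");
        }
        return found;
    }

    // Removes a scheme; returns the removed entry, or null.
    Scheme unregister(String name) {
        return schemes.remove(name.toLowerCase(Locale.ROOT));
    }

    // Returns a snapshot of the registered names; iteration order is unspecified.
    Set<String> getSchemeNames() {
        return new HashSet<String>(schemes.keySet());
    }

    public static void main(String[] args) throws InterruptedException {
        final SchemeRegistrySketch registry = new SchemeRegistrySketch();
        registry.register(new Scheme("http", 80));
        registry.register(new Scheme("https", 443));

        // Many threads look up schemes at once without contending on one monitor.
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < 100000; j++) {
                        registry.getScheme("http");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(registry.getSchemeNames());
    }
}
```

The trade-off is the one noted in the thread: unlike LinkedHashMap, ConcurrentHashMap does not preserve insertion order, so only getSchemeNames() would behave differently.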

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org

