hc-httpclient-users mailing list archives

From vigna <vi...@di.unimi.it>
Subject Re: AbstractNIOConnPool memory leak?
Date Sat, 05 Jan 2013 21:33:34 GMT
> But why would you want a web crawler to have 10-20K simultaneously 
> opened connections in the first place? 

(I thought I answered this, but it's not in the archive. No idea why.)

Having a few thousand connections open is the only way to retrieve data
at a reasonable rate while respecting politeness (i.e., not hitting the
same site too often).
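To make that concrete, here is a back-of-the-envelope sketch (my numbers and class name, not from the thread): if politeness imposes a delay of d seconds between requests to the same host, one connection yields at most 1/d pages/s from that host, so a target rate of r pages/s needs roughly r*d hosts (and hence connections) in flight at once.

```java
// Hypothetical helper illustrating the arithmetic above.
public class CrawlerMath {

    // With a per-host politeness delay (seconds), each connection serves
    // at most 1/delay pages/s, so the target rate requires this many
    // distinct hosts (open connections) in parallel.
    static long connectionsNeeded(double targetPagesPerSec, double politenessDelaySec) {
        return (long) Math.ceil(targetPagesPerSec * politenessDelaySec);
    }

    public static void main(String[] args) {
        // e.g. 10000 pages/s with a 1 s per-host delay -> 10000 connections
        System.out.println(connectionsNeeded(10000, 1.0));
    }
}
```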

I have another question: is there any suggestion for the parameters of the
asynchronous client in the case of several thousand parallel requests (e.g.,
for the IOReactor)? We are experimenting with both DefaultHttpClient and
DefaultHttpAsyncClient, and with the same configuration (e.g., 4000 threads
using DefaultHttpClient, or 64 threads pushing 4000 async requests into a
default DefaultHttpAsyncClient) we see completely different behaviours. The
sync client fetches more than 10000 pages/s; the async client fetches about
50 pages/s. Should we increase the number of threads or the I/O interval of
the IOReactor?

