hc-dev mailing list archives

From "Alvarez, Gil" <...@pogo.com>
Subject RE: question about performance
Date Fri, 09 Apr 2004 18:28:51 GMT
How about this: it turns out that the different timeouts form a small
set. Some want 10 secs, others 30 secs, others 1 minute. So I can keep a
pool of HttpClient objects, one per timeout. This way, each HttpClient
object can be configured with its own timeout.
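The pool-per-timeout idea above could be sketched roughly as follows. This is only an illustration: ClientsByTimeout and its factory argument are hypothetical names, and a plain StringBuilder stands in for the client object so the sketch runs without the HttpClient jar. With HttpClient 2.0 the factory would presumably build one HttpClient (on a multi-threaded connection manager) and call setTimeout() on it once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.IntFunction;

/** Hypothetical helper: hands out one shared client per distinct timeout value. */
final class ClientsByTimeout<C> {
    private final ConcurrentMap<Integer, C> clients = new ConcurrentHashMap<>();
    private final IntFunction<C> factory;

    /** @param factory builds a client that is configured once with the given timeout (ms) */
    ClientsByTimeout(IntFunction<C> factory) {
        this.factory = factory;
    }

    /** Returns the client for this timeout, creating it on first use only. */
    C forTimeout(int timeoutMillis) {
        return clients.computeIfAbsent(timeoutMillis, ms -> factory.apply(ms));
    }
}

final class ClientsByTimeoutDemo {
    public static void main(String[] args) {
        // Stand-in factory (assumption): a real one would create and configure an HttpClient.
        ClientsByTimeout<StringBuilder> pool =
                new ClientsByTimeout<>(ms -> new StringBuilder("client-" + ms));

        StringBuilder tenSeconds = pool.forTimeout(10_000);
        StringBuilder thirtySeconds = pool.forTimeout(30_000);

        System.out.println(tenSeconds == pool.forTimeout(10_000)); // true: same instance is reused
        System.out.println(tenSeconds == thirtySeconds);           // false: one instance per timeout
    }
}
```

Since the set of timeouts is small, the map never grows beyond a handful of long-lived entries, which keeps each client (and its connections) alive between requests.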

But will I still get performance benefits if I use HttpClient in this
way? And will it behave correctly? I do require a timeout to be set, and
I will have multiple concurrent requests executing within one
HttpClient. Will the behavior be as if I specified a timeout per
request, or am I going to get weird behavior (e.g., I set a timeout of
10, request R1 starts at T1, R2 starts at T1+5, and R2 times out at
T1+10, which would be wrong)? Basically, do you fire up that controller
thread per request, or is there just one per HttpClient object?

-----Original Message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org] 
Sent: Friday, April 09, 2004 6:58 AM
To: Commons HttpClient Project
Subject: RE: question about performance

Gil,
The problem is that until Java 1.4 there was simply no way to enforce a
connection timeout. HttpClient only 'mimics' a connect timeout at the
expense of having a controller thread watch over the process of socket
initialization. The controller thread attempts to instantiate a socket
for a given period of time, and if that fails, the controller thread
simply drops the socket on the floor, leaving it up to the garbage
collector to clean up the mess. All of this is very expensive in terms
of resource consumption / memory allocation / garbage collection.
Knowing about this problem well, we have put a lot of effort into
reusing connections as much as possible. This approach works only if
you keep HttpClient, along with its connection manager, alive. Creating
an HttpClient instance per request completely defeats connection reuse
and results in excessive creation/garbage-collection of objects.
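To make the cost of a mimicked timeout concrete, here is a minimal sketch of the general technique: a blocking call runs on a helper thread while the caller waits for a bounded time. It is not HttpClient's actual code; TimeoutController, runWithTimeout and AttemptTimedOutException are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.Callable;

/** Hypothetical exception: the watched task did not finish in time. */
final class AttemptTimedOutException extends Exception {
    AttemptTimedOutException(String message) {
        super(message);
    }
}

/** Hypothetical sketch of bounding a blocking call with a helper thread. */
final class TimeoutController {

    /**
     * Runs the task on a new thread and waits at most timeoutMillis for it.
     * timeoutMillis must be positive (Thread.join(0) would wait forever).
     */
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        final Object[] result = new Object[1];
        final Exception[] failure = new Exception[1];

        // One extra thread per attempt: this is the expensive part.
        Thread worker = new Thread(() -> {
            try {
                result[0] = task.call();
            } catch (Exception e) {
                failure[0] = e;
            }
        });
        worker.setDaemon(true);
        worker.start();

        // The clock starts with this attempt, not with any earlier one.
        worker.join(timeoutMillis);
        if (worker.isAlive()) {
            // Give up on the attempt; whatever it allocated is left for the GC.
            worker.interrupt();
            throw new AttemptTimedOutException("no result after " + timeoutMillis + " ms");
        }
        if (failure[0] != null) {
            throw failure[0];
        }
        @SuppressWarnings("unchecked")
        T value = (T) result[0];
        return value;
    }
}

final class TimeoutControllerDemo {
    public static void main(String[] args) throws Exception {
        System.out.println(TimeoutController.runWithTimeout(() -> "connected", 1000));
        try {
            TimeoutController.runWithTimeout(() -> {
                Thread.sleep(5000); // stands in for a socket that never connects
                return "late";
            }, 100);
        } catch (AttemptTimedOutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

Every call pays for a thread start, and a timed-out attempt leaves abandoned objects behind, which is why reusing already-open connections matters so much in this design.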

> The only setTimeout() calls that I can find are in HttpClient, but I'll
> have multiple concurrent requests that will want different timeouts. How
> do I set a timeout per request?
> 

The problem is that the 2.0 API does not allow timeouts to be controlled
on a per-request basis. There's an open ticket for this bug:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24154

We are planning to fix the problem for the 3.0 release. If you are
absolutely certain you need different timeout values on a per-request
basis, I can even provide a fix for it this weekend. There are also
plans to add support for the 1.4 connect timeout through reflection,
which would circumvent the problem by eliminating the controller thread
when running on newer JDKs. The catch is that you'd have to use the
unstable branch of HttpClient, which is still in a pre-Alpha1 state.
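For reference, the Java 1.4 connect timeout mentioned above is the two-argument Socket.connect() call. The sketch below calls it directly rather than through reflection, so it only shows what the JDK offers, not how HttpClient would wire it in; ConnectWithTimeout and open() are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper showing the JDK 1.4+ connect timeout; no watchdog thread is needed. */
final class ConnectWithTimeout {

    /**
     * Opens a TCP connection, failing with java.net.SocketTimeoutException
     * if the connect takes longer than timeoutMillis (0 means wait indefinitely).
     */
    static Socket open(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket(); // unconnected socket, available since Java 1.4
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close(); // release the socket explicitly instead of leaving it to the GC
            throw e;
        }
    }
}

final class ConnectWithTimeoutDemo {
    public static void main(String[] args) throws IOException {
        // A throwaway local listener so the example does not depend on any external host.
        try (ServerSocket server = new ServerSocket(0);
             Socket socket = ConnectWithTimeout.open("127.0.0.1", server.getLocalPort(), 2000)) {
            System.out.println("connected: " + socket.isConnected());
        }
    }
}
```

Because the timeout is an argument of the connect call itself, each connection attempt carries its own limit and no extra thread is created.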

Oleg


> -----Original Message-----
> From: Oleg Kalnichevski [mailto:olegk@apache.org] 
> Sent: Thursday, April 08, 2004 1:20 PM
> To: Commons HttpClient Project
> Subject: RE: question about performance
> 
> Gil,
> HttpClient#getHost / HttpClient#getPort return the DEFAULT host and port
> used when only a relative request path is given:
> 
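> import org.apache.commons.httpclient.HttpClient;
> import org.apache.commons.httpclient.methods.GetMethod;
>
> HttpClient agent = new HttpClient();
> // relative path: the default host configuration applies
> GetMethod get1 = new GetMethod("/relative/whatever.html");
> // absolute URL: host and port come from the URL itself
> GetMethod get2 = new GetMethod("http://www.whatever.com/absolute/whatever.html");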
> 
> Oleg
> 
> 
> 
> On Thu, 2004-04-08 at 22:01, Alvarez, Gil wrote:
> > Ok, I considered reusing HttpClient, but when I saw methods such as
> > HttpClient.getHost() and getPort(), they implied that at the very least
> > it's not a thread safe class to use. If I have multiple threads
> > executing within one HttpClient object at the same time, and I call
> > HttpClient.getHost(), what's going to happen?
> > 
> > -----Original Message-----
> > From: Oleg Kalnichevski [mailto:olegk@apache.org] 
> > Sent: Thursday, April 08, 2004 12:23 PM
> > To: Commons HttpClient Project
> > Subject: Re: question about performance
> > 
> > Gil,
> > (1) First and foremost, DO reuse HttpClient instances when using the
> > multi-threaded connection manager. The HttpClient class is thread-safe.
> > In fact, there are no known problems with having just one instance of
> > HttpClient per application. Using a new instance of HttpClient for
> > processing each request totally defeats all the performance
> > optimizations we have built into HttpClient.
> > 
> > (2) Use the multi-threaded connection manager if you do not already
> > 
> > (3) Disable the stale connection check
> > 
> > (4) Do not use the connect timeout, which causes a controller thread
> > to be spawned per connection attempt
> > 
> > Oleg
> > 
> > On Thu, 2004-04-08 at 21:02, Alvarez, Gil wrote:
> > > We recently ported our url-hitting code from using java.net.* code to
> > > httpclient code. We use it in a high-volume environment (20 machines are
> > > hitting an external 3rd party to retrieve images).
> > > 
> > > After the port, we saw a significant increase in cycles used by the
> > > machines, about 2-3 times (i.e., the load on the boxes increased from
> > > using up 20% of the cpu to about 50%-60% of the cpu).
> > > 
> > > For each request, we instantiate an HttpClient object, and a GetMethod
> > > object, and shut things down afterwards.
> > > 
> > > In order to reduce the use of cycles, what is the recommended approach?
> > > 
> > > Thank you.
> > > 
> > 
> > 
> >
---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > commons-httpclient-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> > commons-httpclient-dev-help@jakarta.apache.org