commons-dev mailing list archives

From "Michael Mastroianni" <MMastroia...@choicestream.com>
Subject RE: HttpClient -- possible resource leak?
Date Wed, 09 Jun 2004 16:56:48 GMT
It doesn't terminate. Instead, it takes more and more CPU and essentially hangs.

It appears that these objects are not being GC-ed.

So if I want to read smaller chunks, I shouldn't just call getResponseBody?

Whatever I set the maximum heap size to, memory usage climbs toward it in a fairly linear fashion, then asymptotically approaches the max while taking more and more CPU.

Thanks,
Michael
-----Original Message-----
From: olegk@bluewin.ch [mailto:olegk@bluewin.ch] 
Sent: Wednesday, June 09, 2004 12:05 PM
To: Jakarta Commons Developers List
Subject: RE: HttpClient -- possible resource leak?

>5. I looked at Performance Monitor and watched the amount of memory Java was using. When it got up to the limit, my program stopped downloading URLs.

Michael,

What do you mean by "stopped downloading urls"? Does your application terminate with an OutOfMemoryError or something?

My initial guess was that, with a fairly high maximum heap size, the garbage collector _might_ not kick in for quite a while, until a certain limit is reached, thus giving the impression that the application is leaking memory. However, if I understand you right, you are saying that objects are not de-referenced and therefore are not GC-ed?
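
A quick way to distinguish the two cases, for what it's worth: cap the heap much lower and see whether the collector keeps up. Deferred GC would stabilize under a small heap; a genuine leak will still exhaust it, just sooner. (SpiderApp is a hypothetical class name standing in for the real application.)

```shell
# Run with a deliberately small, fixed heap. If the growth was only
# deferred GC, memory use will plateau; a real leak still hits
# OutOfMemoryError. SpiderApp is a placeholder name.
java -Xms64m -Xmx64m SpiderApp
```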

Please do follow Odi's advice and do not buffer the content in memory. Rather, use an InputStream to read the data out in smaller chunks and persist it to disk. That will drastically reduce the amount of garbage generated by HttpClient.

Oleg
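
To illustrate the streaming approach Oleg describes, here is a minimal, self-contained sketch of reading an InputStream in small chunks and persisting it to disk. With HttpClient the stream would come from method.getResponseBodyAsStream(); here a ByteArrayInputStream stands in so the snippet runs on its own, and the 4 KB buffer size and file name are arbitrary choices.

```java
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedDownload {
    // Copy the stream to disk in fixed-size chunks so the whole
    // response body is never held in memory at once.
    static void saveToDisk(InputStream in, String path) throws IOException {
        byte[] buffer = new byte[4096];
        OutputStream out = new FileOutputStream(path);
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } finally {
            out.close(); // always release the file handle
            in.close();  // with HttpClient: close the stream, then releaseConnection()
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for method.getResponseBodyAsStream()
        InputStream body = new ByteArrayInputStream("<html>page</html>".getBytes("UTF-8"));
        saveToDisk(body, "page.html");
        System.out.println("saved");
    }
}
```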


>-- Original Message --
>Reply-To: "Jakarta Commons Developers List"
<commons-dev@jakarta.apache.org>
>Subject: RE: HttpClient -- possible resource leak?
>Date: Wed, 9 Jun 2004 11:46:09 -0400
>From: "Michael Mastroianni" <MMastroianni@choicestream.com>
>To: "Jakarta Commons Developers List" <commons-dev@jakarta.apache.org>
>
>
>Thanks for your help. Here are some details:
>
>1. I've tried 2.1 final and 3.0 alpha: similar problems.
>2. JDK 1.4.2
>3. Windows XP pro
>4. I don't set the initial heap size, but I set the max to 500 MB.
>5. I looked at Performance Monitor and watched the amount of memory Java was using. When it got up to the limit, my program stopped downloading URLs.
>
>I went through the version 2 code in the debugger; it looked as if the method's request-body buffer was never getting cleaned up when I called releaseConnection() on it.
>
>This was my big suspicion, because my memory usage seemed to be going up pretty linearly with downloading, by an amount that seemed reasonable for a web page.
>
>I think I might be doing something drastically wrong, but I've read the
>docs, looked at the example code, and not seen anything obvious.
>
>Thanks again.
>
>Michael
>
>-----Original Message-----
>From: olegk@bluewin.ch [mailto:olegk@bluewin.ch] 
>Sent: Wednesday, June 09, 2004 11:03 AM
>To: Jakarta Commons Developers List
>Subject: RE: HttpClient -- possible resource leak?
>
>Michael
>
>Could you provide us with additional details on the execution environment of your application?
>
>(1) What version of HttpClient are you using?
>(2) What is the JDK version? 
>(3) What platform?
>(4) How exactly do you measure memory consumption by your application?
>(5) Do you set initial and maximum heap size for the JRE?
>
>Oleg
>
>
>>-- Original Message --
>>Reply-To: "Jakarta Commons Developers List"
><commons-dev@jakarta.apache.org>
>>Subject: HttpClient -- possible resource leak?
>>Date: Wed, 9 Jun 2004 10:43:10 -0400
>>From: "Michael Mastroianni" <MMastroianni@choicestream.com>
>>To: <commons-dev@jakarta.apache.org>
>>
>>
>>I have a multi-threaded app, using HttpClient to download a few thousand URLs at a time. Currently, I have one MultiThreadedHttpConnectionManager, which the thread manager creates and passes around to each of its worker threads.
>>
>>Each thread has a queue of URLs, and it creates a new HttpClient, using the ConnectionManager, for each one. I've also tried using a single HttpClient created at construction time for each worker thread, with no luck.
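
For reference, the threading layout described above can be sketched without HttpClient at all: one shared manager object, plus one queue of URLs per worker thread. All names here are hypothetical stand-ins; in the real code the shared object would be the MultiThreadedHttpConnectionManager and fetch() would be an executeMethod()/releaseConnection() pair.

```java
import java.util.Arrays;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SpiderLayout {
    // Stand-in for the single shared MultiThreadedHttpConnectionManager.
    static class SharedManager {
        final AtomicInteger processed = new AtomicInteger();
        void fetch(String url) { processed.incrementAndGet(); } // placeholder for executeMethod(...)
    }

    static class Worker extends Thread {
        private final SharedManager manager; // shared across all workers
        private final Queue<String> urls;    // this worker's own queue
        Worker(SharedManager m, Queue<String> q) { manager = m; urls = q; }
        public void run() {
            String url;
            while ((url = urls.poll()) != null) {
                manager.fetch(url); // one shared manager, reused for every URL
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedManager manager = new SharedManager();
        Worker w1 = new Worker(manager, new ConcurrentLinkedQueue<>(Arrays.asList("http://a", "http://b")));
        Worker w2 = new Worker(manager, new ConcurrentLinkedQueue<>(Arrays.asList("http://c", "http://d")));
        w1.start(); w2.start();
        w1.join(); w2.join();
        System.out.println("processed=" + manager.processed.get()); // prints processed=4
    }
}
```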
>>
>>The worker threads make executeMethod calls, and I notice that I'm leaking a lot of memory (it looks like the memory usage goes up every time I successfully download a page). It seems as if perhaps the underlying buffer of the GetMethod is not being cleaned up. I'm calling releaseConnection() on the GetMethod in a finally block. A relevant piece of code is below:
>>
>>private void SpiderUrlImpl()
>>{
>>    HttpMethod method = new GetMethod(m_sUrl);
>>    try
>>    {
>>        //if(m_State == null)
>>        //{
>>            m_State = new HttpState();
>>            m_State.setCookiePolicy(CookiePolicy.RFC2109);
>>        //}
>>
>>        m_client.setState(m_State);
>>        m_client.setConnectionTimeout(m_timeout);
>>
>>        method.setFollowRedirects(true);
>>        method.setStrictMode(false);
>>        String responseBody = null;
>>
>>        int iCode = m_client.executeMethod(method);
>>        responseBody = method.getResponseBodyAsString();
>>        Header hLoc = method.getResponseHeader("Location");
>>
>>        java.io.FileWriter fw = new java.io.FileWriter(m_sPath + "\\" + m_sFile);
>>        fw.write(responseBody);
>>        fw.close(); // note: fw leaks if write() throws -- it should be closed in a finally block
>>    } //TODO: LOG STUFF GOES HERE
>>    catch (org.apache.commons.httpclient.HttpException he)
>>    {
>>        System.err.println("Http error connecting to '" + m_sUrl + "'");
>>        System.err.println(he.getMessage());
>>    }
>>    catch (IOException ioe)
>>    {
>>        System.err.println("Unable to connect to '" + m_sUrl + "' or print file '" + m_sPath + "\\" + m_sFile + "'");
>>        System.err.println(ioe.getMessage());
>>    }
>>    catch (Exception eExc)
>>    {
>>        System.err.println(eExc.getMessage());
>>    }
>>    finally
>>    {
>>        method.releaseConnection();
>>    }
>>}
>>}
>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>>

