hc-httpclient-users mailing list archives

From Michael Taft <michaelt...@earthlink.net>
Subject Re: getResponseBodyAsStream
Date Sat, 27 Nov 2004 18:00:20 GMT
Duncan & Oleg -
Thanks for the code frag. It's been inserted into my app, and is working 
beautifully.
But, of course, Oleg's comment raises a question. I'm now reading the 
response as a stream, but still parsing it as "one huge String." I based 
my HTML parser on the ones in The Java Tutorial, which all use a String 
as input, so I assumed (ah... the Problem) that feeding in one huge 
String was the way to go.

My method calls look like this:

setParser.parseSetPage(readFully(
    new InputStreamReader(
        get.getResponseBodyAsStream(),
        get.getResponseCharSet())));

where: parseSetPage(String str)

Can you point me in the direction of an online example where the page is 
fed to the parser chunk by chunk?
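
One possible shape for chunk-by-chunk parsing, offered as a minimal sketch and not as anything taken from HttpClient or The Java Tutorial: the parser keeps its state in fields and exposes a feed(char[], int, int) method, so a tag that straddles two chunks is still recognised. ChunkedTagScanner, feed and scan are hypothetical names invented for this illustration, and collecting tag names merely stands in for whatever parseSetPage really does. A StringReader stands in for the InputStreamReader built from get.getResponseBodyAsStream() and get.getResponseCharSet().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical incremental scanner: collects opening-tag names chunk by chunk. */
final class ChunkedTagScanner {
    private final List<String> tagNames = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inTagName = false; // survives across chunk boundaries

    /** Consume one chunk of characters; may be called any number of times. */
    void feed(char[] chunk, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            char c = chunk[i];
            if (inTagName && Character.isLetterOrDigit(c)) {
                current.append(c);
                continue;
            }
            if (inTagName && current.length() > 0) {
                // The name ended (space, '>', '/', ...): record it.
                tagNames.add(current.toString().toLowerCase());
                current.setLength(0);
            }
            // A '<' starts a new name; "</" and "<!" end it at once with an empty name.
            inTagName = (c == '<');
        }
    }

    /** Read the input in chunkSize pieces and never build one large String. */
    static List<String> scan(Reader reader, int chunkSize) throws IOException {
        ChunkedTagScanner scanner = new ChunkedTagScanner();
        char[] buffer = new char[chunkSize];
        int charsRead;
        while ((charsRead = reader.read(buffer)) != -1) {
            scanner.feed(buffer, 0, charsRead);
        }
        // A name cut off by the end of input is deliberately dropped.
        return scanner.tagNames;
    }

    public static void main(String[] args) throws IOException {
        // A tiny chunk size forces tags to be split across chunks.
        Reader page = new StringReader("<html><body><p>Hi</p></body></html>");
        System.out.println(scan(page, 3)); // prints [html, body, p]
    }
}
```

With HttpClient the call would presumably become ChunkedTagScanner.scan(new InputStreamReader(get.getResponseBodyAsStream(), get.getResponseCharSet()), 4096), so the page never has to exist as a single String.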

M.

Oleg Kalnichevski wrote:

> Duncan & Michael,
> 
> This is precisely the way we recommend the response body be consumed.
> 
> The whole idea is that one should REALLY avoid converting the response
> body to a String unless absolutely necessary. One should really be
> consuming the response body as a byte or char stream, which will result
> in much, much more memory efficient code. For instance, if the content
> body ultimately gets fed to an HTML parser or a scanner, it is by far
> more efficient to feed it through a Reader in smaller chunks rather than
> as one huge String.
> 
> There's one little change which I would have made, though:
> 
> readFully(
> new InputStreamReader(
>   get.getResponseBodyAsStream(), 
>   get.getResponseCharSet()));
> 
> Otherwise, everything looks cool.
> 
> Cheers,
> 
> Oleg
> 
> 
> On Sat, 2004-11-27 at 10:05 +0000, Duncan McGregor wrote:
> 
>>It will kind of work, although readLine discards the line end character, which
>>you might well want when parsing the string. And you may want to consider the
>>character set used in the InputStreamReader.
>>
>>Coincidentally I wrote this code yesterday
>>
>>    public static String readFully(Reader input) throws IOException {
>>        BufferedReader bufferedReader = input instanceof BufferedReader
>>                ? (BufferedReader) input
>>                : new BufferedReader(input);
>>        StringBuffer result = new StringBuffer();
>>        char[] buffer = new char[4 * 1024];
>>        int charsRead;
>>        while ((charsRead = bufferedReader.read(buffer)) != -1) {
>>            result.append(buffer, 0, charsRead);
>>        }
>>        return result.toString();
>>    }
>>
>>Call this with doc = readFully(new
>>InputStreamReader(get.getResponseBodyAsStream(), YOURCHARSET));
>>
>>Another good bet would be Jakarta Commons IO  - IOUtils.toString(Reader)
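
To make the point about line ends concrete, here is a small sketch comparing a readLine() loop with block reads of the kind readFully uses. LineEndDemo, readByLines and readInBlocks are hypothetical names chosen for illustration, and a StringReader stands in for the response stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class LineEndDemo {
    /** Line-by-line variant: readLine() strips "\n", "\r" and "\r\n". */
    static String readByLines(Reader input) throws IOException {
        BufferedReader in = new BufferedReader(input);
        StringBuilder result = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            result.append(line);
        }
        return result.toString();
    }

    /** Block variant: every character, including line terminators, is kept. */
    static String readInBlocks(Reader input) throws IOException {
        StringBuilder result = new StringBuilder();
        char[] buffer = new char[4 * 1024];
        int charsRead;
        while ((charsRead = input.read(buffer)) != -1) {
            result.append(buffer, 0, charsRead);
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        String page = "<p>one\r\ntwo</p>\n";
        // The two words run together once the terminators are gone.
        System.out.println(readByLines(new StringReader(page)));
        // The block read returns the input unchanged.
        System.out.println(readInBlocks(new StringReader(page)).equals(page));
    }
}
```

For input "<p>one\r\ntwo</p>\n" the line-based version yields "<p>onetwo</p>", which matters if the parser treats line breaks as whitespace between words.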
>>
>>Duncan Mc^Gregor
>>The name rings a bell
>>www.oneeyedmen.com
>> 
>>
>>-----Original Message-----
>>From: Michael Taft [mailto:michaeltaft@earthlink.net] 
>>Sent: 27 November 2004 07:03
>>To: HttpClient User Discussion
>>Subject: getResponseBodyAsStream
>>
>>HttpClient keeps begging me to use getResponseBodyAsStream, rather than
>>getResponseBodyAsString, due to the size of the response body. I'm willing to do
>>this, even if just to make it happy. However, as a total newbie, I'm not clear
>>about the best way to take a response stream and turn it into a string (that I
>>can then parse, which is what I'm up to).
>>
>>I realize this is a trivial task for most of you. Here is how I propose to do
>>it:
>>
>>------
>>
>>StringBuffer buffer = new StringBuffer();
>>try {
>>    InputStream is = get.getResponseBodyAsStream();
>>    BufferedReader in = new BufferedReader(new InputStreamReader(is));
>>    String str;
>>    // Stop at end of stream so that null is never appended.
>>    while ((str = in.readLine()) != null) {
>>        buffer.append(str);
>>    }
>>} catch (IOException e) {
>>    ...etc.
>>}
>>
>>------
>>
>>My questions about this are:
>>1) Will this work?
>>2) Is there a better way to do it?
>>
>>Thanks.
>>M.
>>
>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: httpclient-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: httpclient-user-help@jakarta.apache.org
>>
> 

-- 
Michael W. Taft
Writer/Editor
4614 Finley Avenue, #3
Los Angeles, CA 90027
(323)663-6042
