hc-httpclient-users mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: HttpAsyncClient bounded download size
Date Wed, 07 Dec 2016 09:36:25 GMT
On Tue, 2016-12-06 at 11:21 -0500, Joseph Naegele wrote:
> Hi folks,
> 
> How can I limit the amount of data downloaded for a request executed by the
> HttpAsyncClient and still process the response as "completed" in the registered
> FutureCallback? The use case is a large scale web crawler that truncates
> resources deemed too large.
> 
> I started by limiting the amount of data read from the response entity's
> InputStream, however this doesn't work with the default
> BasicAsyncResponseConsumer, because it uses the dynamically expanding
> SimpleInputBuffer to download the entire response entity.
> 
> I implemented my own HttpAsyncResponseConsumer, similar to the
> BasicAsyncResponseConsumer, and tried using IOControl to signal shutdown once
> I've read the maximum desired number of bytes, however this triggers a
> ConnectionClosedException. This is undesirable because I can't distinguish it
> from other causes of ConnectionClosedExceptions, and I want to treat "truncated"
> responses as completed in the registered FutureCallback (where I post-process
> the response).
> 
> Is there another method of implementing my desired functionality?
> 
> Thanks,
> Joe Naegele
> 

Hi Joe

Make your custom HttpAsyncResponseConsumer throw a custom exception,
like IHadEnoughException or some such, and see if it gets correctly
propagated to the result callback. If it does not, please raise an issue
in JIRA.

Oleg
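
[Archive note] The suggestion above can be sketched roughly as follows. This is
a library-free model of the pattern, not HttpAsyncClient code: BoundedBuffer,
CrawlCallback and BoundedDownloadSketch are hypothetical stand-ins for a custom
HttpAsyncResponseConsumer, the registered FutureCallback and the client's I/O
loop. IHadEnoughException is the placeholder name from the reply, and carrying
the truncated bytes inside the exception is an assumption of this sketch.
Whether the real client hands the exception unchanged to the callback is
exactly what the reply asks to be verified.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical marker exception; it carries the bytes collected so far.
class IHadEnoughException extends IOException {
    final byte[] truncatedBody;

    IHadEnoughException(byte[] truncatedBody) {
        super("download limit reached");
        this.truncatedBody = truncatedBody;
    }
}

// Stand-in for the byte-counting part of a custom response consumer.
class BoundedBuffer {
    private final int limit;
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    BoundedBuffer(int limit) {
        this.limit = limit;
    }

    // Keeps at most `limit` bytes in total; throws once a chunk does not fit.
    void onChunk(byte[] chunk) throws IHadEnoughException {
        int room = limit - buf.size();
        int n = Math.min(room, chunk.length);
        buf.write(chunk, 0, n);
        if (n < chunk.length) {
            throw new IHadEnoughException(buf.toByteArray());
        }
    }

    byte[] body() {
        return buf.toByteArray();
    }
}

// Stand-in for the registered callback: a truncated download is handled
// like a completed one, any other exception is a real failure.
class CrawlCallback {
    String outcome;
    byte[] body;

    void completed(byte[] result) {
        outcome = "completed";
        body = result;
    }

    void failed(Exception ex) {
        if (ex instanceof IHadEnoughException) {
            outcome = "truncated";
            body = ((IHadEnoughException) ex).truncatedBody;
        } else {
            outcome = "failed";
        }
    }
}

// Stand-in for the I/O loop: feeds chunks to the consumer and routes the
// result or the exception to the callback.
class BoundedDownloadSketch {
    static CrawlCallback fetch(byte[] data, int chunkSize, int limit) {
        CrawlCallback callback = new CrawlCallback();
        BoundedBuffer consumer = new BoundedBuffer(limit);
        try {
            for (int off = 0; off < data.length; off += chunkSize) {
                int end = Math.min(data.length, off + chunkSize);
                consumer.onChunk(Arrays.copyOfRange(data, off, end));
            }
            callback.completed(consumer.body());
        } catch (Exception ex) {
            callback.failed(ex);
        }
        return callback;
    }
}
```

In a real consumer the counting would sit in the method that receives response
content, and the instanceof check would sit in FutureCallback.failed(Exception),
which keeps truncation distinguishable from a ConnectionClosedException.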


