hc-dev mailing list archives

From "Sam Berlin" <sber...@gmail.com>
Subject Use cases of DefaultConnectingIOReactor & NHttpClientHandlerBase w/ HttpRequestExecutionHandler?
Date Tue, 12 Feb 2008 20:24:01 GMT
Whew!  Long subject line...

I've been looking into exactly how all these components interact, and
it got me wondering: how exactly are people using these objects (if
anyone is) right now?  The example NHttpClient is very simplistic in
that it just sends a basic GET request for "/" to three hosts and
accepts any response.  Turning the sample into a somewhat more
real-world scenario (say, a crawler) would seem to involve placing a
lot more information in the context of each request/response &
attachments.  For instance, expanding the example into a crawler would
require:

 1) handleResponse parses the body for more links and adds them to the
context as potential outgoing requests.
 2) handleResponse somehow (?) triggers another call to submitRequest
with the right context.
 3) submitRequest looks up the new context information and submits
more requests.
 4) submitRequest could limit the number of pipelined attempts to a
given host by storing a counter of concurrent attempts in the context,
incrementing it on each submission, while handleResponse decrements it
on each response.
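
To make steps 1, 3 and 4 concrete, here is a rough sketch of the kind of
per-host state I have in mind.  It is plain Java with no HttpCore types:
HostCrawlState, nextRequest and onResponse are names I made up for
illustration, the href regex is a crude stand-in for real link parsing,
and step 2 (the trigger) is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical per-host state that would be stored in the context.
// Not thread-safe: assumes a single connection uses it.
class HostCrawlState {
    // Crude link extraction, for illustration only.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    private final Queue<String> pending = new ArrayDeque<>();
    private final int maxInFlight;
    private int inFlight = 0;

    HostCrawlState(int maxInFlight, String seedPath) {
        this.maxInFlight = maxInFlight;
        pending.add(seedPath);
    }

    // Steps 3 and 4: what submitRequest would call. Returns the next path
    // to request, or null if nothing is queued or the limit is reached.
    String nextRequest() {
        if (inFlight >= maxInFlight || pending.isEmpty()) {
            return null;
        }
        inFlight++;
        return pending.poll();
    }

    // Steps 1 and 4: what handleResponse would call with the body.
    // Queues every href found and frees one in-flight slot.
    List<String> onResponse(String body) {
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(body);
        while (m.find()) {
            found.add(m.group(1));
        }
        pending.addAll(found);
        if (inFlight > 0) {
            inFlight--;
        }
        return found;
    }

    int inFlight() {
        return inFlight;
    }
}
```

With a limit of 2, nextRequest hands out at most two paths before
returning null, and only yields another once onResponse has been called.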

It becomes a little harder to make it work if you want to use a
variable number of connections and share the context.  I imagine there
would need to be some kind of CrawlContext that's shared as the
attachment among multiple connections and used within
handleResponse/submitRequest.
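
Roughly, I picture the shared CrawlContext as something like the sketch
below.  The class and its offer/poll methods are my own invention, not
anything in HttpCore, and I am assuming the handlers for different
connections could run on different threads, hence the concurrent
collections.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical shared attachment: one instance handed to every connection.
class CrawlContext {
    private final Queue<String> frontier = new ConcurrentLinkedQueue<>();
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Would be called from handleResponse on any connection. Returns true
    // if the URL was new and has been queued, false if already known.
    boolean offer(String url) {
        if (seen.add(url)) {
            frontier.add(url);
            return true;
        }
        return false;
    }

    // Would be called from submitRequest on any connection.
    // Returns null when there is nothing left to fetch.
    String poll() {
        return frontier.poll();
    }
}
```

The seen set keeps two connections from queueing the same link twice;
per-host limits would still need separate bookkeeping.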

So...  my question is, is this line of thought correct?  Am I thinking
about the ConnectingReactor & ExecutionHandler the wrong way?  Should
there be some sort of more intricate tie-in between the request & the
response it generates?  (And, the built-in question: how can
handleResponse trigger another call to submitRequest?)
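
To show what I mean by "trigger", here is a toy model of the control
flow I would like to have.  Everything in it is hypothetical
(ExecHandler, ToyConnection, the requestOutput callback); it does not
use the real reactor.  IOControl does seem to have a requestOutput()
method that might play this role, but I have not confirmed how the
client handler reacts to it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the execution handler's two callbacks.
interface ExecHandler {
    String submitRequest(); // null means "nothing to send"
    void handleResponse(String response, Runnable requestOutput);
}

// Toy single-connection loop: submitRequest is only invoked after
// output has been requested (initially by the "connect" event).
class ToyConnection {
    private boolean outputRequested = true;
    final List<String> sent = new ArrayList<>();

    void run(ExecHandler handler) {
        while (outputRequested) {
            outputRequested = false;
            String request = handler.submitRequest();
            if (request == null) {
                break;
            }
            sent.add(request);
            // Pretend the server answered immediately.
            handler.handleResponse("body of " + request,
                    () -> outputRequested = true);
        }
    }
}

// Handler that asks for another submitRequest while work remains.
class QueueHandler implements ExecHandler {
    private final Queue<String> pending = new ArrayDeque<>();

    QueueHandler(String... paths) {
        for (String p : paths) {
            pending.add(p);
        }
    }

    public String submitRequest() {
        return pending.poll();
    }

    public void handleResponse(String response, Runnable requestOutput) {
        if (!pending.isEmpty()) {
            requestOutput.run(); // the "trigger" I am asking about
        }
    }
}
```

Running new ToyConnection().run(new QueueHandler("/", "/a", "/b")) sends
the three paths in order and then goes idle, which is the behaviour I
would want from the real handler.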

Thanks!

Sam
