hc-dev mailing list archives

From "Oleg Kalnichevski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HTTPCORE-325) support custom implementations of SessionInputBuffer and SessionOutputBuffer
Date Fri, 04 Jan 2013 14:12:12 GMT

    [ https://issues.apache.org/jira/browse/HTTPCORE-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543909#comment-13543909 ]

Oleg Kalnichevski commented on HTTPCORE-325:
--------------------------------------------

Noah
(1) I am not familiar with the goals and objectives of the Heritrix project, but I can't help
wondering whether capturing HTTP headers down to each and every bit is _really_ that necessary.
If you were to reconstruct the HTTP headers from the CharArrayBuffers of the individual headers, you
could still get an almost exact original representation of the HTTP message head. The one exception
is the status line, whose original representation is not preserved by default, and even that could
be mended. You would also cut a lot of the complexity on your end by not having to mess around with
the session buffers and their internals.
(2) If spending a few more CPU cycles can be afforded (which should be the case for a web
crawler), why not simply scan the recorded bytes for the '\r\n\r\n' pattern that ends the message head?
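
A minimal sketch of the scan suggested in (2), assuming the recorded bytes are available as a plain java.io.InputStream. The class name HeaderEndScanner and the method findBodyOffset are hypothetical and are not part of HttpCore or Heritrix. It tracks how much of the CR LF CR LF sequence has been matched so far and reports the offset of the first byte after it, which is where the message body would begin.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpCore or Heritrix.
final class HeaderEndScanner {

    private HeaderEndScanner() {
    }

    /**
     * Reads the stream byte by byte and returns the offset of the first byte
     * following the first CR LF CR LF sequence, or -1 if the stream ends
     * before that sequence is seen.
     */
    static long findBodyOffset(InputStream in) throws IOException {
        int matched = 0; // number of bytes of "\r\n\r\n" matched so far
        long pos = 0;    // number of bytes consumed so far
        int b;
        while ((b = in.read()) != -1) {
            pos++;
            if (b == '\r') {
                // CR continues the pattern only after a complete CR LF;
                // otherwise it starts a fresh match.
                matched = (matched == 2) ? 3 : 1;
            } else if (b == '\n') {
                // LF continues the pattern only directly after a CR.
                matched = (matched == 1) ? 2 : (matched == 3) ? 4 : 0;
            } else {
                matched = 0;
            }
            if (matched == 4) {
                return pos;
            }
        }
        return -1;
    }

    /** Convenience overload for bytes that are already in memory. */
    static long findBodyOffset(byte[] recorded) throws IOException {
        return findBodyOffset(new ByteArrayInputStream(recorded));
    }

    /** Convenience overload for ASCII test input. */
    static long findBodyOffset(String recorded) throws IOException {
        return findBodyOffset(recorded.getBytes(StandardCharsets.US_ASCII));
    }
}
```

Note that this only recognises the strict CR LF CR LF terminator. Servers that end lines with a bare LF would need a more lenient matcher, so treat this as a starting point rather than a complete solution.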

Oleg
                
> support custom implementations of SessionInputBuffer and SessionOutputBuffer
> ----------------------------------------------------------------------------
>
>                 Key: HTTPCORE-325
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-325
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>    Affects Versions: 4.3-alpha2
>            Reporter: Noah Levitt
>         Attachments: httpcore-325-20121231182846.diff
>
>
> In Heritrix we have a set of classes that wrap streams and record them verbatim for replay.
> One of the things they need to do is make a note of where the HTTP headers end and the message
> body begins. In order to make this work with HttpComponents I found I needed custom implementations
> of Session*Buffer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org

