httpd-dev mailing list archives

From TOKI...@aol.com
Subject Re: HTTP Protocol Question
Date Thu, 17 Apr 2003 12:52:49 GMT

> In a message dated 4/17/2003 9:55:38 AM Central 
> Daylight Time, bill@wstoddard.com writes:

> Consider an application (CGI, servlet, whatever) that reads bytes sent 
> from the client in a chunked POST request then sends bytes back in a 
> chunked response.  Does HTTP/1.1 -require- the application to read -all- 
> the POST'ed bytes before sending the reply? In other words, can the 
> application read some bytes from the POST, construct and begin sending a 
> reply (chunked), read some more POST bytes, send some more reply 
> bytes, and so on indefinitely?  Does anything in RFC2616 prohibit this?
>
> Thanks,
> Bill

Not really... but you would certainly have to code
your own User-Agent that knows this is going to be
happening to get any useful benefit.

I know of no browser that would 'understand'
this kind of exchange.

Even the latest versions of the major browsers are
still coded to be 'half-duplex' and are still
basically just treating the protocol as 
'push to talk' ( I say something, you say something )
with the logical granularity confined to REQUEST
( entire REQUEST/POST sent ) and then RESPONSE
( entire RESPONSE read ).
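For reference, the chunked transfer-coding that the question relies on is simple enough to sketch. This is a minimal illustration in Python (the helper names are mine, not from any real library), assuming a complete body with no trailers:

```python
# Hypothetical helpers sketching RFC 2616 chunked transfer-coding.
# Each chunk is: <size-in-hex> CRLF <data> CRLF; a zero-size chunk ends the body.

def encode_chunk(data: bytes) -> bytes:
    """Wrap one piece of body data as a single chunk."""
    return b"%x\r\n%s\r\n" % (len(data), data)

def encode_last_chunk() -> bytes:
    """The zero-length chunk that terminates the body."""
    return b"0\r\n\r\n"

def decode_chunks(stream: bytes) -> bytes:
    """Reassemble a complete chunked body (trailers not handled)."""
    body = b""
    pos = 0
    while True:
        eol = stream.index(b"\r\n", pos)
        size = int(stream[pos:eol], 16)
        if size == 0:
            return body
        start = eol + 2
        body += stream[start:start + size]
        pos = start + size + 2   # skip the data and its trailing CRLF
```

On the wire, `encode_chunk(b"hello ") + encode_chunk(b"world") + encode_last_chunk()` decodes back to `b"hello world"` — the point being that either side can emit chunks incrementally, which is what makes the interleaved exchange in the question possible at the wire level.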

The 'exception' is '100 Continue'.

As far as I know... that is the ONLY conditional in
the RFC's which actually 'requires' a User-Agent to
be in full-duplex versus half-duplex while
sending/receiving a REQUEST/RESPONSE.

If the browser sends 'Expect: 100-continue' then it is
SUPPOSED to start listening for the '100 Continue'
before it starts uploading the POST data.
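A client that actually honors this might look something like the following sketch (the function name and timeout are my own assumptions; the RFC only says the client should not wait indefinitely before giving up and sending the body anyway):

```python
import select
import socket

# Hypothetical sketch of the client side of 'Expect: 100-continue'
# (RFC 2616 section 8.2.3): after sending the request headers, wait
# a bounded time for '100 Continue' before uploading the POST body.

def wait_for_continue(sock: socket.socket, timeout: float = 2.0) -> bool:
    """Return True if the server granted '100 Continue' in time."""
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        return False          # no answer yet; RFC says go ahead anyway
    reply = sock.recv(4096)
    return reply.startswith(b"HTTP/1.1 100")
```

The key design point is the bounded `select()`: the client is briefly full-duplex (listening while it still has a request body to send), which is exactly the behavior the browsers described below skip.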

What's really bizarre, however, is that even if/when
most major browsers send 'Expect: 100-continue' they don't
do that. They just go ahead and blast away with
the POST data before ever giving the Server a
chance to actually send the '100 Continue'.

To make matters worse... there is also a blurb in
the RFC's which says that a Server is allowed to
send a '100 Continue' even if the User-Agent
didn't actually ASK for one with 'Expect: 100-continue'.

This has, historically, caused all kinds of 
problems since most browsers and proxies are in
no way coded to handle this.

IIS is famous for putting out all these
'Unexpected 100 Continues' and while it is
LEGAL to do so according to RFC's... it has
broken many, many a Proxy and User-Agent.

Most Proxies and User-Agents are in no way
listening 'full-duplex' while they are posting
and the only time they will discover that the
Server was trying to 'talk back' while they
were POSTING data is when they completely finish
transmitting the POST data ( all of it ) and 
then they turn around to see what the Server
might have to say.

Only then ( AFTER all the POST data is already
sent ) do they discover the unexpected '100 Continue'
already sitting in their receive buffer(s).

This would 'break' all kinds of things because,
until people started coding to throw away the
'100 Continues', the interim response would be
interpreted as the RESPONSE to the POST. Not good.

A lot of high-dollar security products like
SiteMinder had huge problems with Unexpected
100 Continues showing up in full-duplex mode
when they were assuming it was all supposed
to be a half-duplex conversation. Their 
solution was not to change to full-duplex
but to simply 'throw away' any unexpected
100 Continues that might be sitting in the
receive buffers after all the POSTING is done.

I believe that's the way SQUID now handles
'unexpected 100 Continues' as well. It's not
'listening' in full-duplex at all and it just
knows to throw them away when it finally 
turns around and starts 'listening' for
the RESPONSE.
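The 'throw it away' workaround those products landed on can be sketched like this (a hypothetical helper written from the description above, not taken from SiteMinder or Squid source):

```python
# Hypothetical sketch of the half-duplex workaround: given the raw
# bytes read back after all the POSTing is done, discard any interim
# 1xx responses sitting in front of the real one. Interim responses
# have headers only, so each ends at the first blank line.

def skip_interim_responses(raw: bytes) -> bytes:
    """Drop leading 'HTTP/1.x 1xx ...' responses; return the real one."""
    while raw.startswith(b"HTTP/1.1 1") or raw.startswith(b"HTTP/1.0 1"):
        end = raw.find(b"\r\n\r\n")
        if end == -1:
            break             # incomplete interim response; leave as-is
        raw = raw[end + 4:]
    return raw
```

Note this never listens full-duplex; it just cleans up the receive buffer after the fact, which matches the behavior described above.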

In the case of ERROR responses showing up
ASAP... this might make it LOOK like major
browsers are actually doing 'full duplex'
send/receives but it's just an illusion.

What's really happening there is this: the
browser has already started blasting away
with POST data, but the Server has already
decided 'No soup for you', sent an ERROR
page, and CLOSED the connection as a result
of the error. The browser never actually SEES
the error response coming back in-between
POST data sends... instead, the next POST
send on the browser side gets a transmit
error, and only then does it discover the
error response already sitting in its
receive buffer for that transaction.
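That sequence can be simulated in-process with a socketpair standing in for the connection (a sketch, assuming a Unix-ish platform where sending to a closed peer raises EPIPE and already-buffered data stays readable):

```python
import socket

# Simulating the 'illusion' above: the 'server' sends an error and
# closes; the 'client' only notices while trying to keep transmitting
# POST data, then finds the error already in its receive buffer.

client, server = socket.socketpair()
server.sendall(b"HTTP/1.1 413 Request Entity Too Large\r\n\r\n")
server.close()                      # 'No soup for you'

send_failed = False
try:
    for _ in range(64):             # keep 'blasting away' with POST data
        client.sendall(b"x" * 4096)
except OSError:                     # BrokenPipeError / ConnectionResetError
    send_failed = True

leftover = client.recv(4096)        # the error was here all along
```

The client never 'saw' the error arrive mid-POST; the transmit failure is what finally makes it look in its receive buffer.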

I don't know much about go-bots, search-bots,
and/or scrape-bots. They might actually run
full-duplex rather than half-duplex the way
browsers do. Dunno.

Later...
Kevin



