cxf-dev mailing list archives

From "Dan Diephouse" <...@envoisolutions.com>
Subject Re: Problems with Chunking
Date Tue, 20 Feb 2007 15:13:29 GMT
Hi Eoghan,

Comments inline...

On 2/20/07, Glynn, Eoghan <eoghan.glynn@iona.com> wrote:
>
>
> > -----Original Message-----
> > From: Dan Diephouse [mailto:dan@envoisolutions.com]
> > Sent: 19 February 2007 19:12
> > To: cxf-dev@incubator.apache.org
> > Subject: Problems with Chunking
> >
> > Hi All,
> >
> > I did some debugging over the weekend with a user and ASP.NET
> > seems to have problems if chunking isn't on. Here is the
> > response that comes when it is turned on:
> >
> > HTTP/1.1 400 Bad Request
> > Server: ASP.NET Development Server/8.0.0.0
> > Date: Sat, 17 Feb 2007 07:55:29 GMT
> > X-AspNet-Version: 2.0.50727
> > Cache-Control: private
> > Content-Length: 0
> > Connection: Close
> >
> > It works fine however if chunking is turned off. There are
> > other servers as well that don't work with chunking, which is
> > why we ultimately turned off chunking.
> >
> > I want to suggest that either
> >
> > a) We turn off chunking by default.
> > b) We have some threshold for chunking. For instance, first
> > we stream up to 100K to a byte[] buffer. If there is still
> > more to write, we write the buffer and the rest of the
> > request as a chunked request. Otherwise it is written as a
> > non-chunked request.
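
(As a concrete sketch of (b) - every name below is invented, it's just the
shape of the idea: buffer up to a threshold, then pick chunked vs.
fixed-length:)

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    // Sketch only - none of these names exist in CXF today.
    public abstract class ThresholdingOutputStream extends OutputStream {

        private final int threshold;                  // e.g. 100 * 1024
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private OutputStream target;                  // non-null once committed

        protected ThresholdingOutputStream(int threshold) {
            this.threshold = threshold;
        }

        public void write(int b) throws IOException {
            if (target == null && buffer.size() >= threshold) {
                // Too big to hold: commit to a chunked request, drain the buffer.
                target = openChunkedConnection();
                buffer.writeTo(target);
            }
            if (target != null) {
                target.write(b);
            } else {
                buffer.write(b);
            }
        }

        public void close() throws IOException {
            if (target == null) {
                // Everything fit: send with a Content-Length header, no chunking.
                target = openFixedLengthConnection(buffer.size());
                buffer.writeTo(target);
            }
            target.close();
        }

        // However the transport actually opens the wire connection.
        protected abstract OutputStream openChunkedConnection() throws IOException;
        protected abstract OutputStream openFixedLengthConnection(int contentLength)
            throws IOException;
    }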
>
> Well the problem with this approach is what happens if the request is
> >100k and the server-side happens to be ASP.NET? Since we fallback to
> chunking once the 100k threshold is reached, presumably the server-side
> will barf and we're back where we started.
>
> So I don't really like the idea of a band-aid that will work some of the
> time, but allow the old problem to creep back in when there's an
> unexpectedly large outgoing request.
>
> Ironically, we had a long discussion on this list some time back, with a
> lot of opposition expressed to the way the HTTP wrapper output stream
> buffers up the request payload up to the first flush(), so as to allow
> headers to be set by interceptors after the first payload write may have
> occurred.


I don't think it's so ironic. My objection was on the server side. If you
recall, I wanted the ability to do writes without creating new buffers so we
can do efficient XML routing. This was (is?) impossible because we were
always creating buffers at the transport layer. It's on my list to
review what we currently have, as I still think that using a
CachedOutputStream on the response is a little dodgy. We shouldn't need to
create a file or buffer for the response; we should be able to just
write the headers on the first write().  Are there any cases where we're
creating HTTP headers between the start of writing a response and the first
flush()? I can't think of any. I think the big use case that was mentioned
was that theoretically something could go wrong when we first start
writing, and buffering would allow us to switch to writing a fault without
any consequences. But if we're already writing, chances are the damage is
done, and the fault is more low-level - i.e. there is a problem with the
stream itself.
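
Put differently, the response stream only needs to look something like this
(a sketch with invented names - the point is just that the header commit
happens lazily on the first write):

    import java.io.IOException;
    import java.io.OutputStream;

    // Sketch: commit the HTTP headers lazily on the first write()
    // instead of buffering the whole response body.
    public abstract class LazyCommitOutputStream extends OutputStream {

        private boolean committed;

        public void write(int b) throws IOException {
            if (!committed) {
                committed = true;
                commitHeaders();  // flush whatever headers interceptors have set
            }
            writeToWire(b);
        }

        protected abstract void commitHeaders() throws IOException;
        protected abstract void writeToWire(int b) throws IOException;
    }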

For normal requests we will want a BufferedOutputStream for performance
reasons, but that is managed by Woodstox right now, as it wraps the
OutputStream when you create an XMLStreamWriter.
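
i.e. (here httpOutputStream just stands in for whatever stream the transport
hands us):

    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamWriter;

    // Woodstox buffers internally, so writes reach httpOutputStream in batches.
    XMLStreamWriter writer = XMLOutputFactory.newInstance()
        .createXMLStreamWriter(httpOutputStream);
    writer.writeStartDocument();
    // ... write the payload ...
    writer.flush();  // push anything still buffered down to httpOutputStream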

> But back to the issue at hand ... I guess there are a few other situations
> in which turning off chunking and buffering up the request body would be
> useful, for example if we anticipate a 401 Basic Auth challenge or 30x
> redirect may occur.
>
> So here's a variation on your buffering idea ... instead of imposing an
> arbitrary 100k limit, say we allow unlimited buffering (with content
> over-flowing to a local temp file if the payload exceeds some size
> reasonable to keep in memory), but *only* if we have a reasonable
> expectation that the server may be unable to handle chunked incoming
> requests.
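
(FWIW, that overflow behaviour is more or less what our CachedOutputStream
does already - memory up to a threshold, then a temp file. I'm going from
memory on the exact API, so double-check before relying on this:)

    // Going from memory on the CachedOutputStream API - verify before use.
    CachedOutputStream cos = new CachedOutputStream(64 * 1024); // in-memory limit
    cos.write(payload);            // small payloads stay in memory,
                                   // big ones spill to a temp file
    cos.writeCacheTo(wireStream);  // replay the whole body once ready to send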
>
> This expectation could either be configured, if the client developer
> knows upfront that the server-side stack is buggy in this respect (and
> wow, it really is a fundamental bug, sorta begs the question what
> possesses folks to use such a thing ...).
>
> If the server-side stack is unknown, then the client could be configured
> to probe it upfront with an innocuous HTTP GET specifying the chunked
> transfer-encoding, but with an entity-body composed of exactly one empty
> chunk. If we get back a 400 response, we infer the server-side is
> chunking-intolerant and buffer up the real outgoing POSTs. If on the
> other hand, we get a 200, then we fallback to chunking.
>
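
For the record, that probe would be simple enough over a raw socket - an
untested sketch:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.Socket;

    // Untested sketch of the probe: send a chunked GET with a single empty
    // chunk and see whether the server answers 200 or 400.
    public class ChunkingProbe {
        public static boolean serverTolerates(String host, int port)
            throws Exception {
            Socket s = new Socket(host, port);
            try {
                OutputStream out = s.getOutputStream();
                out.write(("GET / HTTP/1.1\r\n"
                         + "Host: " + host + "\r\n"
                         + "Transfer-Encoding: chunked\r\n"
                         + "Connection: close\r\n"
                         + "\r\n"
                         + "0\r\n\r\n").getBytes("US-ASCII"));
                out.flush();
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), "US-ASCII"));
                String status = in.readLine();  // e.g. "HTTP/1.1 400 Bad Request"
                return status != null && status.contains(" 200 ");
            } finally {
                s.close();
            }
        }
    }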

Are the redirect/authentication cases in particular HTTP server bugs, or
limitations of HTTP? It sounds like the latter. I suppose we could keep a
list of servers that should default to non-chunked, but it sounds
like that doesn't help the other cases.

How about this counter-counter-proposal :-) It seems we have a lot of cases
that actually require non-chunked requests:
- broken servers
- authentication
- redirects

So why not turn off chunking by default and put in a log message which
states something to the effect of: "HTTP chunking is turned off by default
for compatibility reasons. For possible performance improvements, try
enabling chunking."
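
And for anyone who does want it back on, it should just be a matter of
flipping the allowChunking attribute on the conduit's client policy -
something like this, assuming the generated HTTPClientPolicy keeps that
attribute ('port' here is the client proxy):

    import org.apache.cxf.frontend.ClientProxy;
    import org.apache.cxf.transport.http.HTTPConduit;
    import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

    // Assumes HTTPClientPolicy keeps its allowChunking attribute.
    HTTPConduit conduit = (HTTPConduit) ClientProxy.getClient(port).getConduit();
    HTTPClientPolicy policy = new HTTPClientPolicy();
    policy.setAllowChunking(true);  // default would be false under this proposal
    conduit.setClient(policy);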

For small requests (i.e. a couple of KB, which are the most common), it's
likely to be the same performance, as Woodstox wraps the OutputStream in a
BufferedOutputStream.  Is performance the only reason you want it turned on
by default?

Regards,

- Dan

-- 
Dan Diephouse
Envoi Solutions
http://envoisolutions.com | http://netzooid.com/blog
