Return-Path: 
Delivered-To: new-httpd-archive@hyperreal.org
Received: (qmail 3767 invoked by uid 6000); 9 Aug 1998 01:40:48 -0000
Received: (qmail 3752 invoked from network); 9 Aug 1998 01:40:46 -0000
Received: from paris.ics.uci.edu (mmdf@128.195.1.50) by taz.hyperreal.org with SMTP; 9 Aug 1998 01:40:46 -0000
Received: from kiwi.ics.uci.edu by paris.ics.uci.edu id aa11746; 8 Aug 98 18:26 PDT
To: new-httpd@apache.org
Subject: Re: [PATCH] DoS fix
In-reply-to: Your message of "Fri, 07 Aug 1998 23:05:10 PDT."
Date: Sat, 08 Aug 1998 18:26:03 -0700
From: "Roy T. Fielding"
Message-ID: <9808081826.aa11746@paris.ics.uci.edu>
Sender: new-httpd-owner@apache.org
Precedence: bulk
Reply-To: new-httpd@apache.org

>This replaces the O(n^2) space with O(n), and the O(n^2) cpu time stuff
>with O(n*lg(n)).  The idea is simple -- read everything first, without
>merging, then sort based on the keys, then merge the values if required.

Cute hack, but the HTTP protocol is designed for efficient handling of
header fields.  The problem is that our tables data structure sucks for
HTTP header fields (we've talked about that in the past).  If we are
going to make any significant change to the parsing, then I'd start with
replacing the data structure with something that doesn't require a copy
on merge.

There are a bunch of ways to do this, but my guess (based on the header
fields actually used being a sparse subset of those defined by HTTP) is
that a hash- or tree-based tokenizer for reducing the field name to an
easy-to-compare number, combined with a balanced tree for holding the
field table (log N access) and a linked list for holding the value(s)
associated with each field table entry, is the way to go for solving all
of these problems in the long term.

The extra merge cost of trailers is unavoidable given the desire to
produce overall metadata after the data is produced, but without
batching the whole process.
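A minimal C sketch of the shape described above, to make the idea concrete.  Everything here is illustrative, not Apache code: the token table holds a made-up sparse subset of field names, and a flat array indexed by token stands in for the balanced tree (it keeps the example short while preserving the point that repeated fields append to a per-entry linked list in O(1), with no copy on merge).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Hypothetical tokenizer: reduce a field name to an easy-to-compare
 * number.  Unknown names fall through to TOK_OTHER. */
enum field_tok { TOK_HOST, TOK_ACCEPT, TOK_CONTENT_LENGTH, TOK_OTHER, TOK_MAX };

static const char *tok_names[] = { "Host", "Accept", "Content-Length" };

enum field_tok field_tokenize(const char *name)
{
    int i;
    /* Linear scan for brevity; a real tokenizer would hash or use a trie. */
    for (i = 0; i < TOK_OTHER; i++)
        if (strcasecmp(name, tok_names[i]) == 0)
            return (enum field_tok)i;
    return TOK_OTHER;
}

/* Each field-table entry holds its value(s) in a linked list, so a
 * repeated field is an O(1) tail append rather than a merge-and-copy. */
struct field_val {
    const char *val;
    struct field_val *next;
};

struct field_table {
    struct field_val *heads[TOK_MAX];
    struct field_val *tails[TOK_MAX];
};

/* Caller supplies the node (e.g. from a pool allocator), so the sketch
 * needs no malloc. */
void field_add(struct field_table *t, struct field_val *node,
               const char *name, const char *value)
{
    enum field_tok tok = field_tokenize(name);
    node->val = value;
    node->next = NULL;
    if (t->tails[tok])
        t->tails[tok]->next = node;
    else
        t->heads[tok] = node;
    t->tails[tok] = node;
}
```

With this layout, merging repeated values into one comma-separated string (when a caller actually asks for it) is the only place a copy ever happens.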
This works just dandy if everyone can handle chunked content, but it
sucks for us because neither the proxy nor the CGI stuff is capable of
handling chunked, so the only way for us to handle it properly is to
buffer the data to a file.  But that is still better than holding up
progress just because older systems are not ready to partake in the
benefits of new protocol features.

....Roy