Return-Path: 
Delivered-To: new-httpd-archive@hyperreal.org
Received: (qmail 3767 invoked by uid 6000); 9 Aug 1998 01:40:48 -0000
Received: (qmail 3752 invoked from network); 9 Aug 1998 01:40:46 -0000
Received: from paris.ics.uci.edu (mmdf@128.195.1.50) by taz.hyperreal.org with SMTP; 9 Aug 1998 01:40:46 -0000
Received: from kiwi.ics.uci.edu by paris.ics.uci.edu id aa11746; 8 Aug 98 18:26 PDT
To: new-httpd@apache.org
Subject: Re: [PATCH] DoS fix
In-reply-to: Your message of "Fri, 07 Aug 1998 23:05:10 PDT."
Date: Sat, 08 Aug 1998 18:26:03 -0700
From: "Roy T. Fielding"
Message-ID: <9808081826.aa11746@paris.ics.uci.edu>
Sender: new-httpd-owner@apache.org
Precedence: bulk
Reply-To: new-httpd@apache.org

>This replaces the O(n^2) space with O(n), and the O(n^2) cpu time stuff
>with O(n*lg(n)).  The idea is simple -- read everything first, without
>merging, then sort based on the keys, then merge the values if required.

Cute hack, but the HTTP protocol is designed for efficient handling of
header fields.  The problem is that our tables data structure sucks for
HTTP header fields (we've talked about that in the past).  If we are
going to make any significant change to the parsing, then I'd start with
replacing the data structure with something that doesn't require a copy
on merge.

There are a bunch of ways to do this, but my guess (based on the header
fields actually used being a sparse subset of those defined by HTTP) is
that a hash- or tree-based tokenizer for reducing the field name to an
easy-to-compare number, combined with a balanced tree for holding the
field table (log N access) and a linked list for holding the value(s)
associated with each field table entry, is the way to go for solving all
of these problems in the long term.

The extra merge cost of trailers is unavoidable given the desire to
produce overall metadata after the data is produced, but without
batching the whole process.
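A minimal C sketch of the shape described above, to make the idea concrete.  Everything here is illustrative, not Apache code: the token table holds a made-up sparse subset of field names, and a flat array indexed by token stands in for the balanced tree (it keeps the example short while preserving the point that repeated fields append to a per-entry linked list in O(1), with no copy on merge).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Hypothetical tokenizer: reduce a field name to an easy-to-compare
 * number.  Unknown names fall through to TOK_OTHER. */
enum field_tok { TOK_HOST, TOK_ACCEPT, TOK_CONTENT_LENGTH, TOK_OTHER, TOK_MAX };

static const char *tok_names[] = { "Host", "Accept", "Content-Length" };

enum field_tok field_tokenize(const char *name)
{
    int i;
    /* Linear scan for brevity; a real tokenizer would hash or use a trie. */
    for (i = 0; i < TOK_OTHER; i++)
        if (strcasecmp(name, tok_names[i]) == 0)
            return (enum field_tok)i;
    return TOK_OTHER;
}

/* Each field-table entry holds its value(s) in a linked list, so a
 * repeated field is an O(1) tail append rather than a merge-and-copy. */
struct field_val {
    const char *val;
    struct field_val *next;
};

struct field_table {
    struct field_val *heads[TOK_MAX];
    struct field_val *tails[TOK_MAX];
};

/* Caller supplies the node (e.g. from a pool allocator), so the sketch
 * needs no malloc. */
void field_add(struct field_table *t, struct field_val *node,
               const char *name, const char *value)
{
    enum field_tok tok = field_tokenize(name);
    node->val = value;
    node->next = NULL;
    if (t->tails[tok])
        t->tails[tok]->next = node;
    else
        t->heads[tok] = node;
    t->tails[tok] = node;
}
```

With this layout, merging repeated values into one comma-separated string (when a caller actually asks for it) is the only place a copy ever happens.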
This works just dandy if everyone can handle chunked content, but it
sucks for us because neither the proxy nor the CGI stuff is capable of
handling chunked, so the only way for us to handle it properly is to
buffer the data to a file.  But that is still better than holding up
progress just because older systems are not ready to partake in the
benefits of new protocol features.

....Roy