Received: by taz.hyperreal.com (8.6.12/8.6.5) id SAA12726;
    Thu, 30 May 1996 18:35:50 -0700
Received: from battra.telebase.com by taz.hyperreal.com (8.6.12/8.6.5)
    with ESMTP id SAA12719; Thu, 30 May 1996 18:35:46 -0700
Received: from wormhole.telebase.com by battra.telebase.com id VAA23550
    for ; Thu, 30 May 1996 21:35:44 -0400 (EDT)
Received: from spudboy.telebase.com (chuck@spudboy.telebase.com [172.16.2.215])
    by wormhole.telebase.com (8.7.3/8.6.9.1) with ESMTP id VAA00483
    for ; Thu, 30 May 1996 21:35:40 -0400 (EDT)
Received: (from chuck@localhost) by spudboy.telebase.com (8.6.12/8.6.9.1)
    id VAA02188 for new-httpd@hyperreal.com; Thu, 30 May 1996 21:35:39 -0400
From: Chuck Murcko
Message-Id: <199605310135.VAA02188@telebase.com.>
Subject: Re: An idea for state saving
To: new-httpd@hyperreal.com
Date: Thu, 30 May 1996 21:35:39 -0400 (EDT)
In-Reply-To: <199605302228.SAA01543@volterra.ai.mit.edu>
    from "Robert S. Thau" at May 30, 96 06:28:06 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-new-httpd@apache.org
Precedence: bulk
Reply-To: new-httpd@hyperreal.com

Robert S. Thau liltingly intones:
>
> What's stopping the query output from being written to, say, 10 pages of
> 10 results each, all linked with numbers and Next/Prev. Point the user
> to the first page and there you go.
>
> Hey, *any* common resource which can be safely written and accessed by
> all of the web server child processes could be used this way, given a
> little coding, though the file server probably is the path of least
> resistance. As to what Alta Vista does, or how, I don't honestly know,
> and I'm a bit curious...
>
Me too. All I've read is that part of the search performance is due to
brute force - 6 or so GB of RAM contains the database index. It's brute
force, and expensive, but it works. How they generate their pages for
final delivery is still unknown to me.
Their pages are persistent, because I can set my browser cache to zero
and still move back and forth among the various pages delivered from the
query. I'd suspect something like a very fast, very large file server or
shared disk array.

> This is not multiple simultaneous updates to the database, is it? Just
> queries, right?
>
> Yep... multiple queries to a common database back end can be a problem.
> F'rinstance, let's say that you have multiple server processes talking
> to some kind of back end (database, search engine, whatever) through
> a common pipe. One of these server processes writes a query to the
> pipe. Subsequently, it reads a result. However, this may not be the
> result of the query it made --- it could be the result of another query
> which another child sent simultaneously down the pipe.
>
> (You can detect these situations by stamping IDs on the queries and
> results, but then it gets really tricky to deliver the misrouted
> response to the child process that asked for it... and all of this
> gets even more fun if the queries are written in pieces, and get
> intermixed in the pipe).
>
Yes indeed. The classic multiplexer approach. It *is* tricky, but
generally the least expensive from a hardware standpoint. You put your
$$$ into one (or several) extremely fast channels to the back end
database engine.

> One way of dealing with this is by just making sure that each child
> has its own channel to the back end (e.g., opens its own socket, in the
> cases of the ILU requester and FastCGI).
>
I'd venture to extend this model one layer deeper. One difference from
your description would be that the database engine is two layers back
from the front end httpd machines, and the intermediate layer of machines
actually does the assembly of on-the-fly generated pages from both static
content they'd have locally (not necessarily HTML, but perhaps SGML) and
the pointers returned from the database queries, each of which gets a
channel into the database.
Bulk static content is provided to the httpd machines in front from a
disk farm shared with the second-tier machines. It'd be darned expensive,
but it would scream.

chuck
Chuck Murcko   N2K Inc.   Wayne PA   chuck@telebase.com
And now, on a lighter note:
Brooks' Law: Adding manpower to a late software project makes it later
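
For the shared-pipe case discussed above, stamping IDs on queries and results might look roughly like the following. This is only a minimal sketch under loud assumptions: threads and `queue.Queue` objects stand in for the httpd child processes and the common pipe, and the names `Multiplexer`, `reversing_backend`, and `demo` are hypothetical, not anything from Apache, ILU, or FastCGI. The toy back end deliberately answers out of order to show that each caller still gets its own result.

```python
import itertools
import queue
import threading


class Multiplexer:
    """Share one channel pair among many callers by stamping IDs on queries."""

    def __init__(self, to_backend, from_backend):
        self._to_backend = to_backend      # stands in for the shared pipe (outbound)
        self._from_backend = from_backend  # stands in for the shared pipe (inbound)
        self._pending = {}                 # query ID -> one-slot mailbox of the caller
        self._lock = threading.Lock()
        self._ids = itertools.count(1)
        threading.Thread(target=self._dispatch, daemon=True).start()

    def query(self, text, timeout=5.0):
        mailbox = queue.Queue(maxsize=1)
        with self._lock:
            query_id = next(self._ids)
            self._pending[query_id] = mailbox
        # Each query goes out as one whole message, so queries are never
        # written in pieces and cannot get intermixed on the channel.
        self._to_backend.put((query_id, text))
        try:
            return mailbox.get(timeout=timeout)
        finally:
            with self._lock:
                self._pending.pop(query_id, None)

    def _dispatch(self):
        # Single reader: route every result to whoever asked for it, by ID.
        while True:
            query_id, result = self._from_backend.get()
            with self._lock:
                mailbox = self._pending.pop(query_id, None)
            if mailbox is not None:
                mailbox.put(result)


def reversing_backend(to_backend, from_backend, batch=2):
    """Toy back end (assumption): answers each batch of queries in reverse order."""
    while True:
        items = [to_backend.get() for _ in range(batch)]
        for query_id, text in reversed(items):
            from_backend.put((query_id, text.upper()))


def demo():
    to_backend, from_backend = queue.Queue(), queue.Queue()
    threading.Thread(target=reversing_backend,
                     args=(to_backend, from_backend), daemon=True).start()
    mux = Multiplexer(to_backend, from_backend)
    answers = {}

    def worker(text):
        answers[text] = mux.query(text)

    workers = [threading.Thread(target=worker, args=(t,))
               for t in ("apache", "httpd")]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return answers


if __name__ == "__main__":
    print(demo())
```

The single dispatcher is what makes delivery of "misrouted" responses manageable here. Across real child processes that routing step is the hard part, which is why giving each child its own channel to the back end is the simpler design.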