httpd-dev mailing list archives

From "Paul J. Reder" <rede...@raleigh.ibm.com>
Subject Re: Mod_include design
Date Wed, 25 Oct 2000 20:10:35 GMT
rbb@covalent.net wrote:
> > As far as I can tell this is just a restatement of what I said. A tag can span
> > brigades. The pertinent bytes from the first brigade need to be split and set
> > aside so that they can be concat'ed to the following brigade so that the matching
> > and processing can continue. If I am incorrect please help me understand.
> 
> You are correct, but that is already working supposedly, and there should
> be no reason to do any memcpy's to make this work completely.  The problem
> is that your design is an exact re-statement of the current design.
> 

First off, when I started working on this it wasn't yet hacked into mod_include so
we came to basically the same conclusion concurrently. Second, I don't know where
you are talking about memcpy's being done.

> But there is no state to save.  I'll explain.  If we have a brigade that
> resolves to:
> 
>   foobar <!-
> 
> then we want to split at < so that we have two brigades:
> 
> "foobar " and "<!-"
> 
> On the next call to the includes_filter, we just concat the two brigades,

Where was the first brigade squirreled away until the second invocation of mod_include?
Are you going to start checking for the STARTING_SEQUENCE from the beginning again,
or are you going to pick up where you left off? If you will pick up where you left
off (as my current code does) how do you know where to resume?

> and we will have either:
> 
> "<!-" "- garbage again"
> 
> or
> 
> "<!-" "garbage again"
> 
> Notice, concat doesn't have anything to do with memcpy.  Regardless, it is
> four compares to check the start tag and four for the ending tag.  Since
> we need to do the split and keep the data around regardless, it doesn't
> make any sense to also save a state in that structure.
> 

Again with the memcpy's. The concat that I spoke of was to concat two brigades.
If you are going to parse through a collection of buckets (or what eventually
gets concat'ed from multiple brigades into one brigade of buckets) why not tag
the interesting parts as you go? For instance, tag the starting point of the
SSI directive and compute its length when you run through the first time. This 
saves having to parse through a second time trying to skip the tag to get to
the directive. And before you say "It is just 4 bytes.", they may be spread over
4 buckets. If you already did the parsing work once, why repeat it? Store the pointers
as state info that is maintained from one invocation to the next. The state is
just a small set of pointers and indexes; there is no copying until the directive
is copied into a contiguous buffer (and even this is not strictly required).
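
To make the "small set of pointers and indexes" concrete, here is a minimal
sketch of the idea, not the actual mod_include code. It keeps a single counter
between invocations so that a start sequence split across chunks is matched
without rescanning. The names ssi_scan_state and ssi_scan_chunk are
hypothetical, plain character buffers stand in for bucket data, and the value
of STARTING_SEQUENCE is assumed to be "<!--#".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value; substitute whatever the module actually defines. */
#define STARTING_SEQUENCE "<!--#"

/* Hypothetical state carried from one filter invocation to the next. */
typedef struct {
    size_t matched;   /* bytes of STARTING_SEQUENCE matched so far */
} ssi_scan_state;

/*
 * Scan one chunk of data (standing in for the contents of one bucket).
 * Returns the offset just past the start sequence if it is completed in
 * this chunk, or -1 if more data is needed. A partial match is remembered
 * in st->matched, so the next call resumes where this one stopped.
 */
static long ssi_scan_chunk(ssi_scan_state *st, const char *buf, size_t len)
{
    const size_t seq_len = strlen(STARTING_SEQUENCE);
    size_t i;

    assert(st != NULL && st->matched < seq_len);

    for (i = 0; i < len; i++) {
        if (buf[i] == STARTING_SEQUENCE[st->matched]) {
            st->matched++;
            if (st->matched == seq_len) {
                st->matched = 0;          /* ready for the next directive */
                return (long)(i + 1);
            }
        } else {
            /*
             * Mismatch. Because '<' occurs only at the start of the
             * sequence, the only possible restart point is this byte.
             */
            st->matched = (buf[i] == STARTING_SEQUENCE[0]) ? 1 : 0;
        }
    }
    return -1;                            /* sequence not completed yet */
}
```

With the brigade from the example above, scanning "foobar <!-" leaves
st->matched at 3 and returns -1; a following chunk that begins "-#" then
completes the match two bytes in, with no byte examined twice and nothing
copied. A real implementation would presumably also record which bucket and
offset the '<' was found at, so the brigade can be split there.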

> The problem is in taking this piecemeal.  Fixing bucket handling without
> fixing the tag handling is not going to work.  The bucket handling is
> intimately tied to the tag handling.  For example, if the tag is a set tag,
> it doesn't make any sense to send the data before we try to parse the tag,
> because the tag won't generate any data.  I would like the handle_foo
> functions to return a bucket brigade that will be inserted into the current
> brigade.

Thank you for the specific example. This makes sense. I will work to rewrite
the parser and processor as an integral part of the brigade processing. I can
see the cases where just inserting on the fly would be appropriate. I would
still assume that for cgi directives the leading buckets would be sent on
since the processing could take a while. The cgi output would then be sent
along as its own brigade later. Reasonable?

> That's fine.  I was hacking in changes to get mod_include to work
> again.  I would still like to see somebody else re-write mod_include,
> because I really just don't want to do it.  However, if we are going to
> re-write it, we should be ripping out the guts and re-writing it, not
> continuing the hack that is currently in place.

Gut-ripping has commenced.
 
> Plus, this really should be implemented as a real parser, not the current
> hack we have.
 
I'm assuming you aren't looking for a full yacc/lex-generated parser for
this simple setup, just something cleaner and more formal than the kludged
string matcher that is currently there.

-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein
