From Greg Stein <>
Subject Re: Bucket API cleanup issues
Date Tue, 27 Feb 2001 15:53:26 GMT
On Tue, Feb 27, 2001 at 09:54:25AM -0500, Cliff Woolley wrote:
> On Mon, 26 Feb 2001 wrote:
> > > 3) pool_bucket_cleanup() is completely bogus AFAICT.  I've added this
> > > comment to the code, which describes the problems pretty well:
> > > 4) The same problem applies to file buckets that have been split/copied
> > > when APR_HAS_MMAP: when one of them gets read it gets made into an MMAP.
> > > ALL of the file buckets should be converted at the same time to reference
> > > the same MMAP.
> >
> > I disagree about how to fix this.  The simple solution is to just abstract
> > one level, such as:
> >
> >   bucket   ->     bucket   ->   bucket
> >     |               |             |
> >   shared          shared        shared
> >     |               |             |
> >      -----------------------------
> >                     |
> >                 pool_bucket
> So is what you're saying that struct apr_bucket should be a member of TWO
> rings... one brigade as usual plus one ring ("brigade") of sibling
> buckets that point to the same resource?  I'd thought of that, but didn't
> think anyone would buy it.  I'm fine with the idea.  But I *must* be
> missing something... how does that keep us from traversing a list of
> buckets when we convert types?  (PS: Remember that the "shared" level is
> going away.)  If I've missed your point, please elaborate.

Euh... I don't think we want another ring.

A simpler idea is to have the apr_bucket_pool structure contain a pointer to
an apr_bucket_heap structure. At cleanup time, you do the following:

*) if h->heap_ptr is non-NULL, then use it for your new s->data field.
   do appropriate incref/decref for the two shared structures. toss the
   private pool stuff when refcount==0

*) if h->heap_ptr is NULL, then create a new apr_bucket_heap from the pool
   bucket. leave a ptr to this in the pool bucket. change self to point to
   the new heap and do the right incref/decref. zero-out the ptrs in the
   pool structure (they'll be invalid momentarily; this is for safety in
   case somebody tries to access them)

*) in the read() function (and others), if h->heap_ptr is non-NULL, then
   rebuild the bucket around the (shared) apr_bucket_heap structure. adjust
   incref/decref as appropriate.

So what you get here is a single allocation of a heap bucket. Then, you
lazily repoint all other buckets to the new heap bucket.

This does imply that the pool bucket structures must be malloc'd rather than
come from the pool (because the structs must live until the last reference,
even though the pool may disappear sooner).

Strictly speaking, you'll never get a read where h->heap_ptr is non-NULL.
When the pool is cleaned up, the first bucket hit will construct the heap
bucket. All the rest participating in the cleanup will get repointed. By the
end of the cleanup, nobody will be referring to the old pool substructure,
and it will go away.

But logically, it is better to make allowances for h->heap_ptr in case some
fool wants to read() during a cleanup.
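
The lazy repointing described above can be sketched roughly as follows. This
is a minimal illustration under stated assumptions, not the APR code: every
pb_* name is hypothetical, the structs are stand-ins for apr_bucket_pool and
apr_bucket_heap, and letting the pool structure hold its own reference on the
heap structure is a choice made for this sketch so that heap_ptr cannot
dangle while un-converted buckets remain.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real apr_bucket_heap / apr_bucket_pool. */
typedef struct pb_heap {
    int refcount;
    char *base;            /* malloc'd copy of the data */
    size_t len;
} pb_heap;

typedef struct pb_pool {
    int refcount;          /* buckets still pointing at this struct */
    const char *base;      /* pool-owned memory, invalid after cleanup */
    size_t len;
    pb_heap *heap_ptr;     /* NULL until the first bucket is converted */
} pb_pool;

typedef struct pb_bucket {
    int is_heap;           /* 0: data is a pb_pool*, 1: data is a pb_heap* */
    void *data;
} pb_bucket;

void pb_heap_unref(pb_heap *h)
{
    if (--h->refcount == 0) { free(h->base); free(h); }
}

/* The pool struct is malloc'd so it can outlive the pool itself. */
pb_pool *pb_pool_make(const char *base, size_t len)
{
    pb_pool *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->refcount = 0; p->base = base; p->len = len; p->heap_ptr = NULL;
    return p;
}

void pb_pool_unref(pb_pool *p)
{
    if (--p->refcount == 0) {
        if (p->heap_ptr != NULL) pb_heap_unref(p->heap_ptr);
        free(p);
    }
}

void pb_bucket_attach(pb_bucket *b, pb_pool *p)
{
    b->is_heap = 0; b->data = p; p->refcount++;
}

/* What each bucket's cleanup would do: build the heap copy once,
 * then repoint this bucket at it. Returns 0 on success, -1 on OOM. */
int pb_bucket_morph(pb_bucket *b)
{
    pb_pool *p;
    pb_heap *h;
    if (b->is_heap) return 0;
    p = b->data;
    if (p->heap_ptr == NULL) {
        h = malloc(sizeof *h);
        if (h == NULL) return -1;
        h->base = malloc(p->len ? p->len : 1);
        if (h->base == NULL) { free(h); return -1; }
        if (p->len) memcpy(h->base, p->base, p->len);
        h->len = p->len;
        h->refcount = 1;               /* reference held by the pool struct */
        p->heap_ptr = h;
        p->base = NULL; p->len = 0;    /* zero out soon-to-be-invalid ptrs */
    }
    h = p->heap_ptr;
    h->refcount++;                     /* incref the new shared struct */
    b->data = h; b->is_heap = 1;
    pb_pool_unref(p);                  /* decref the old one */
    return 0;
}

/* read() makes allowances for a heap_ptr set by a sibling's cleanup. */
const char *pb_bucket_read(pb_bucket *b, size_t *len)
{
    pb_heap *h;
    if (!b->is_heap) {
        pb_pool *p = b->data;
        if (p->heap_ptr == NULL) { *len = p->len; return p->base; }
        if (pb_bucket_morph(b) != 0) { *len = 0; return NULL; }
    }
    h = b->data;
    *len = h->len;
    return h->base;
}

void pb_bucket_destroy(pb_bucket *b)
{
    if (b->is_heap) pb_heap_unref(b->data);
    else pb_pool_unref(b->data);
    b->data = NULL;
}
```

Only the first bucket to be converted pays for the allocation and copy; every
later bucket just swaps its data pointer and adjusts the two refcounts.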

> > > 6) Should mmap_destroy() call apr_mmap_delete(m->mmap) when the last
> > > reference to the mmap bucket is destroyed?
> >
> > No.  If you do that, then caching MMAP's becomes impossible.  The
> > apr_mmap_delete should be left up to the pool cleanup.
> Okay, no problem.  I'd just noticed it when I was going through the code
> and thought I'd bring it up, but I wasn't married to the idea.  I figured
> it was left out for a reason but didn't know what it was.  Scratch #6.

A better approach is to have the cache insert the MMAP bucket with a
refcount starting at 2 (one for the cache, one for the bucket). When the
last bucket goes away, you'll still have a refcount of 1 and the MMAP will
remain alive.

When we auto-convert a FILE to an MMAP, it will start with refcount==1. This
allows the MMAP to be tossed when the bucket goes away. (pool cleanup could
be a *long* ways away)

This gives us immediate cleanup when possible, and deferred cleanup when
needed [by a cache].
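
A minimal sketch of that refcount convention, with hypothetical rc_* names
standing in for the shared MMAP structure. The live-mapping counter exists
only to make the demo observable, and the spot marked in the comment is where
a real implementation would presumably call apr_mmap_delete().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared structure behind one or more MMAP buckets. */
typedef struct rc_mmap {
    int refcount;
} rc_mmap;

int rc_live_mappings = 0;   /* demo-only bookkeeping */

/* initial_refs is 2 when a cache inserts the bucket (cache + bucket),
 * and 1 when a FILE bucket is auto-converted (bucket only). */
rc_mmap *rc_mmap_make(int initial_refs)
{
    rc_mmap *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->refcount = initial_refs;
    rc_live_mappings++;
    return m;
}

/* Drop one reference. Returns 1 if the mapping was torn down. */
int rc_mmap_release(rc_mmap *m)
{
    if (--m->refcount > 0) return 0;
    /* last reference gone: a real implementation would unmap here */
    rc_live_mappings--;
    free(m);
    return 1;
}
```

With this convention the bucket destroy path is identical in both cases; only
the starting count decides whether the mapping outlives its last bucket.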

> > > 7) socket_read() and pipe_read() should return APR_SUCCESS even in an EOF
> > > condition, as discussed on new-httpd a week or so ago.  I haven't made
> > > this change yet, though I have put an XXX comment in the code about it,
> > > because it requires fixing the input filters in Apache to figure out EOS on
> > > their own.  Looking for suggestions on how to do that.

If you do a blocking read on a **SOCKET bucket** for one byte, then you'll
end up with one of three situations:

1) EOF was hit (internally) and the socket is turned into a zero-length HEAP
   bucket. Apache doesn't do anything special to recognize this situation.

2) Timeout was reached. APR_TIMEUP is returned by the read operation. This
   is an actual error situation, so Apache handles it as appropriate.

3) At least one byte is returned in a HEAP bucket.

Apache will determine when to insert the EOS based on when the protocol says
the request has ended. This is based on chunking or Content-Length or
end-of-headers. The client cannot close the connection to denote the end
of a request.

Note that situation (1) should only occur in one of two cases:

a) the client aborted the connection
b) attempting to read a second (or Nth) request, and the client closed it

Okay. The above is for reading an individual bucket. That is, the read()
function of a bucket.

Reading from the input filter **stack** is different. Let's say that case
(1) occurs and you've "read" the zero bytes out of the brigade. The brigade
is now empty, so you do an ap_get_brigade() for more. This drops all the way
down to the core_input_filter. It says, "sorry, I gave you everything I had"
and returns APR_EOF. *NOW* is where the EOS is detected.

core_input_filter can do one of two things:

 i) insert an EOS right after the SOCKET bucket at setup time. continue to
    return APR_EOF if the SOCKET bucket (and EOS bucket) is consumed and
    core_input_filter is called for more.

ii) don't return APR_EOF, but return EOS instead

I prefer the former. The upper layers cannot distinguish "end of this
request" from "there is nothing more" in the second case. They'll go for
more and keep getting EOS. But that will be a 400 (Bad Request) because it
looks empty.
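
To make the two levels concrete, here is a rough, simplified sketch of option
(i). All in_* names are hypothetical and the socket is simulated by a byte
counter; this is not the real bucket or filter API, and a real SOCKET bucket
read is more involved than a single type change.

```c
#include <assert.h>
#include <stddef.h>

enum in_status { IN_SUCCESS, IN_TIMEUP, IN_EOF };
enum in_type   { IN_SOCKET, IN_HEAP, IN_EOS };

typedef struct in_bucket {
    enum in_type type;
    size_t len;
} in_bucket;

/* Simulated connection state. */
typedef struct in_conn {
    size_t bytes_left;   /* 0 means the peer has already closed */
    int timed_out;       /* nonzero: a blocking read would time out */
    int handed_out;      /* core filter already gave out SOCKET + EOS */
} in_conn;

/* Bucket-level read(): EOF is not an error here. The SOCKET bucket
 * just becomes a HEAP bucket, possibly of zero length. */
enum in_status in_bucket_read(in_bucket *b, in_conn *c, size_t *len)
{
    if (b->type == IN_SOCKET) {
        if (c->timed_out) return IN_TIMEUP;   /* a genuine error */
        b->type = IN_HEAP;
        b->len = c->bytes_left;               /* 0 at EOF */
        c->bytes_left = 0;
    }
    *len = b->len;
    return IN_SUCCESS;
}

/* Filter-level fetch, option (i): SOCKET followed by EOS is set up
 * once; any later call reports "nothing more" with IN_EOF. */
enum in_status in_core_get_brigade(in_conn *c, in_bucket out[2], size_t *n)
{
    if (c->handed_out) { *n = 0; return IN_EOF; }
    out[0].type = IN_SOCKET; out[0].len = 0;
    out[1].type = IN_EOS;    out[1].len = 0;
    *n = 2;
    c->handed_out = 1;
    return IN_SUCCESS;
}
```

A caller that drains the first brigade sees the EOS bucket as "end of this
request"; only a further fetch yields IN_EOF, which means "there is nothing
more". That is the distinction option (ii) would lose.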

In case (2) above, Apache tears down the connection, so EOS is never an
issue.

In case (3) above, you have data, so EOS is not an issue.

And when you're successfully fetching data, the EOS is inserted by the upper
level filters (HTTP_IN? DECHUNK? dunno which it is) when it determines the
logical request has ended (despite what the lower level filters may say
about more data being present).
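
As an illustration of the Content-Length case only, an upper-level filter
could decide where the logical request ends along these lines. The cl_* names
are hypothetical, and this is not a claim about which Apache filter does the
work or how it does it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-request state: body bytes still owed to this request. */
typedef struct cl_state {
    size_t remaining;
} cl_state;

/* Given `avail` bytes from the lower filters, return how many belong to
 * the current request and set *insert_eos once the body is complete.
 * Any bytes beyond the return value belong to the next request. */
size_t cl_take(cl_state *s, size_t avail, int *insert_eos)
{
    size_t take = avail < s->remaining ? avail : s->remaining;
    s->remaining -= take;
    *insert_eos = (s->remaining == 0);
    return take;
}
```

The EOS decision depends only on the protocol-level count, so it still fires
when the lower filters have more (pipelined) data waiting.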

The big gimmick is to not confuse return values from read() and those of
ap_get_brigade(). They have different semantics.

That should do the trick. Save this MsgID somewhere :-)


Greg Stein,
