apr-dev mailing list archives

From <...@covalent.net>
Subject Re: [PATCH] bucket type "capabilities" flags
Date Tue, 19 Jun 2001 17:03:43 GMT

> > What does this provide us?  We are talking about improving performance by
> > a single function call, and either way, we have to handle error
> > conditions from those functions.  The code is much easier to read and
> > understand if we just use the return code to determine that the function
> > doesn't exist.  Consider the following:
> >
> > 	if (bucket->flags & APR_BUCKET_FLAG_SETASIDE) {
> >             if (apr_bucket_setaside(...) != APR_SUCCESS) {
> >                 /* handle error */
> >             }
> >         }
> >
> > When compared to:
> >
> >         if (apr_bucket_setaside(...) != APR_SUCCESS) {
> >             /* handle error */
> >         }
>
> I'm imagining some situation where you want to make sure ahead of time
> that all of the buckets in a brigade support, say, setaside().  If any one
> of them doesn't, then that's really bad news, and you want to punt to a
> backup algorithm before you even start trying to setaside() any of them.
> For a single call, you're right -- there's no need to check ahead of time.
> It's only in these "big picture" situations that you care.

At this point, you are trying to second-guess the bucket type.  If the
bucket doesn't implement setaside, then it doesn't need setaside.  If we
are going to look through an entire brigade before we run through it
calling a function, then we have to run the brigade twice.  We are better
off just running through the brigade once, calling the functions.

> > > "metadata" bucket, and whether the bucket has a file descriptor in it for
> > > the use by apr_sendfile().
> >
> > metadata buckets might be useful, but I am having a hard time seeing it.
> > Do we really expect more modules to implement their own metadata buckets?
> > Does it matter if they do?  By definition, all buckets implement read
> > functions.  A metadata bucket is simply a bucket that always returns no
> > data from read.  If a module implements their own metadata bucket, then
> > that module needs to be the thing to handle that bucket type, so I am
> > missing the utility of the metadata flag.
>
> I don't know.  Maybe in one of those situations like the example I gave
> above, you want to be sure that all DATA buckets support setaside.  If a
> bucket doesn't support setaside and it's a metadata bucket, you could care
> less -- just skip over it.  Anyway, I threw this one in because it seemed
> like a good idea at the time... maybe it's not that useful.  <shrug>

I don't believe it is useful, but I could be wrong about that.

> > > can be sendfile'd ONLY if it matches APR_BUCKET_IS_FILE(), though the
> > > above is clearly a case where an APR_BUCKET_SUPPORTS_SENDFILE() test would
> > > be much more useful and extensible.  The flags field gives us that
> > > ability.
> >
> > I am severely missing how this file works.  If it can only be sendfile'd,
> > then we are basically saying that no files in the file cache can be sent
> > through a filter.
>
> That's correct--almost.  In fact, that's the whole problem we're trying to
> solve.  Right now, mod_file_cache has to assume that if ANY content filter
> is on the stack, then it must DECLINE the request and punt it over to the
> default handler because that filter might try to read from its file
> descriptor, which it cannot support.  However, there are several pitfalls
> in that logic.  The main problem is that there might be a "benign"
> content filter out there (something like content_length or byterange,
> perhaps) that looks like a content filter but actually is never going to

Those aren't benign.  Content_length has every right to call read or
setaside on a bucket.  So does byterange.

> read the buckets as long as their length is known ahead of time.  So
> currently mod_file_cache can never be used on a request with one of those
> filters in the chain, even though there's no reason why it wouldn't work.
> The second problem is that the check for content filters that
> mod_file_cache uses assumes that it can just look at the topmost filter in
> the stack; that filter must be a content filter if any filters in the
> stack are content filters.  If there are no content filters in the stack,
> it assumes it is safe to pass its shared file descriptor down.  But these
> assumptions seem very easily broken... the whole thing is just too
> fragile.  Using a custom bucket type is much, much more robust and fixes
> all of these problems.
>
> The short version of how it works is that mod_file_cache sticks its
> apr_file_t into one of these cachedfile buckets and sends it down the
> stack.  The cachedfile_create function also takes as parameters the
> filename and a pool.  If any filter ever tries to read from the cachedfile
> bucket, the cachedfile_read() function reopens the file into the pool that
> was passed in at creation-time (namely, r->pool), morphs the bucket into a
> real file bucket, and reads from that.  So we're no worse off than if we'd
> punted to the default handler in the first place--all it would have done
> is open the file into the request pool and create a file bucket to put it
> in, which is exactly what we've done--but ONLY if we absolutely have to.
> See?

Yes, I see how it works, but I don't believe this is the correct solution.
How did you open the file originally?  Did you use apr_file_open, or did
you use native calls?  If you used apr_file_open, then APR already has
enough information to do exactly what you are doing without the custom
bucket type.  My concern is that you are building logic for when a file
can be read and when it can't into a custom bucket, but that logic is
actually very important for APR to have.  This does make the file_bucket
slightly more complex, because now that bucket type has to handle a new
error from apr_file_read(), but it should reduce the amount of duplicate
code, and it removes the requirements for the flags.

> If you want more detail, I'll post the patch to mod_file_cache in a
> separate thread.  It's a work in progress at the moment, but there's
> enough of it there that you should be able to see what we're trying to
> accomplish.

I do understand what you are trying to accomplish, but I disagree with the
method you are trying to use.

> > Support for limiting what can be done with a
> > file descriptor, should be a part of the apr_file_t structure.  So, if you
> > are opening a file to be shared across multiple threads, then that
> > information should be stored in the apr_file_t, and then the read/mmap
> > functions should look at the apr_file_t and realize that they can't
> > operate on that file.  Once that work is done, I don't understand the need
> > for the sendfile flag.
>
> But that would mean that an attempt to read from one of these regular file
> buckets with a "special" file handle in it would fail.  That would be very
> bad.  The right way to do it is that the bucket should just know how to
> handle the situation, which is to give the thread its own private file
> handle to the file (by reopening it) if and only if the particular request
> really calls for it.  This is very different from the actions of regular
> file buckets, so a custom bucket type works quite nicely.  All it has to

You are trying to second-guess the filters, which is not a good idea,
IMNSHO.  By just folding the logic into file buckets, you remove a lot of
the special cases that you have mentioned above.  In the end, the complex
logic ends up in the file_bucket, but filters can be relatively simple.

> do is make sure that it keeps its FD at the same offset as the regular
> file bucket does, and watch out for any calls to read().  Sure, this logic
> could be stuffed into the regular file bucket, but the regular file bucket
> logic is complicated enough as it is.  Special type of data storage,
> special bucket type... it's a lot cleaner that way.  Besides, who's to say
> that somebody else won't come along later and have yet ANOTHER bucket type
> that has a file descriptor in it that they want to sendfile()...

That's hand-waving.  I could say that somebody can come up with a bucket
that requires that I allow pipes or sockets to be sent through sendfile,
but that doesn't mean I have actually done it.  We shouldn't be trying to
design for every possible eventuality.  Design for what we need today, and
extend it when we need more functionality.

> Well, like I said, the main reason I need this is for the sendfile thing.
> I'm sure that there are other cases that will arise later on where
> somebody (possibly a 3rd party module developer) will want to use a custom
> bucket type and will want to have his bucket type get handled by the core
> in some special way (eg this sendfile situation).  In any of those cases,
> it's better to have a flag that says "this bucket supports this special
> function" than to have a hardcoded test for one of the canonical bucket
> types which would require hacking the core to get around.  If I can't
> convince you of the need for flags that indicate which functions are
> implemented (which I hope I already have), then hopefully I can at least
> convince you of this.

I disagree that the flags for function pointers are a good idea.  I also
disagree that trying to figure out all the flags that could allow the core
to handle all buckets optimally is a good idea.  The core code is designed
to handle the core bucket types optimally.  Trying to allow the core to
optimize sending any possible bucket type is a losing battle in my
opinion.  This doesn't mean we shouldn't make an attempt, but it does mean
that we don't have to bend over backwards.  The current method we use to
support extended bucket types is to force all buckets to implement a read
function.  Maybe we want to do more, but if so, let's at least state that
our goal in doing this is to allow the core to optimize sending ALL
possible bucket types.  If the goal is anything less, then I would suggest
that we
take the simplest solution, which we have already done with the read
requirement.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------


