apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Bloom <rbl...@gmail.com>
Subject Re-architecture for 2.0 tree
Date Thu, 20 Jan 2005 13:19:12 GMT
grrrrr,  should have sent this to the list.

Ryan


---------- Forwarded message ----------
From: Ryan Bloom <rbloom@gmail.com>
Date: Wed, 19 Jan 2005 15:05:10 -0500
Subject: Re: Re-architecture for 2.0 tree
To: "William A. Rowe, Jr." <wrowe@rowe-clan.net>


So, I took a few shortcuts in my original message, explained below.

On Wed, 19 Jan 2005 13:16:10 -0600, William A. Rowe, Jr.
<wrowe@rowe-clan.net> wrote:
> At 12:33 PM 1/19/2005, Ryan Bloom wrote:
>
> >One of my biggest mistakes when initially designing APR was that I
> >forced each platform to have completely distinct implementations for
> >simple functions if the structures were distinct.  This has led to a
> >great amount of duplicate code in the library, and makes it harder to
> >see where the differences between the platforms truly are.  I would
> >love to see APR 2.0 solve this problem.
>
> +1
>
> >In another header file:
> >#ifdef WIN32
> >struct arch_file {
> >    HANDLE filedes;
> >}
> >#else
> >struct arch_file {
> >    int filedes;
> >}
> >#endif
>
> I still like the concept of doing something like
>
> struct apr_foo_t {
>     apr_pool_t *pool;
>     void *      foodat;
>     struct apr_foo_arch_t arch;
> }
>
> and defining struct apr_foo_arch_t within an include/arch/xxx/foo.h
> file.  This gains you the transparency of common elements (pool,
> foodat, etc) while leaving certain elements opaque.  Because the
> arch element above is undefined, the structure remains incomplete.
> (We can't anticipate the maximum size for the object.)
>
> Right now, in 1.2.0, we could add an element onto the end of the
> apr_foo_t opaque structure, and not break abi.  Entirely transparent
> structures don't share that attribute, folks can determine the size
> of the complete, transparent structure in their code, and force
> minor version dependencies that don't exist today.
>
> The bit that disturbed me was long #ifdef litanies which we
> know (from apache-1.3/src/server/main.c) get illegible quickly.

Yeah, I know, this was a shortcut.  I honestly do expect those
arch_file_t type of structures to be hidden inside private headers and
to have them as incomplete structures, but I was too lazy to make that
clear, you caught me.

> >This can allow simple functions, like apr_file_get_name(),
> >to read the name of the file quickly (and only have one
> >implementation of that function), while still allowing native
> >functions to be used for the more complex work.
>
> Again +1.  I'd even suggest the structure
>
>   apr/foo/common/*.c
>   apr/foo/netware/*.c
>   apr/foo/unix/*.c
>   apr/foo/win32/*.c
>
> where we drop the concept of '/unix/' as the 'common' impl,
> and build apr/*/common/ on all platforms.

++1.  Using Unix as the common base was a hack, which I only got away
with because POSIX was renamed to UNIX98 at some point.   :-D

> >If we don't want the apr_file_t structure to be incomplete, we can
> >still provide the getters/setters for the internal structures, but
> >also open up our structures, so that people can get the file's name by
> >accessing fp->name directly.
>
> Huge risks, garbage collection and modifying associated fields.
> The more we let people diddle in the transparent elements, the
> less they recognize side effects that are only possible when
> they invoke the set'ter function.  This will vary based on the
> apr_xxx_t object we are talking about of course.

Yeah, but I am a firm believer that some stuff is private, and some
stuff isn't private.  The filename doesn't necessarily need to be a
private field.  We could change my original proposal to be:

typedef struct apr_file_t {
    char *name;
    apr_common_file_t *common;
    apr_arch_file_t *arch_specific;
}

Then the common stuff is hidden in an incomplete type, but some stuff
can be easily exposed.  Obviously not all structures would necessarily
need all three types of data.

> >This may also help us remove the depenance on pools throughout the
> >code.  My least favorite part of APR is that all allocation _must_ be
> >done through pools, even if pools don't make sense for your
> >application (I take full blame for that).
>
> Unfortuantely -we- perform allocations within our own operations
> on these objects.  When that happens, we have to look at the pool
> allocator.  I believe we can and should revisit allocator/free
> indirection.  Yes, a small performance hit will result, but it
> should not be substantial enough to invalidate the entire concept.

I think we should take a closer look at why _we_ allocate structures.
Or, we could make the allocator stuff a compile-time choice, which
would mean no performance hit.  Or, we could have functions to
allocate from a pool, and other functions that allocate using
malloc/free.  Lots of options.

> >With this model, end-users can allocate the space for apr_file_t
> >themselves, and all we need to do is figure out how to allocate
> >the arch_file structure correctly, a much smaller allocation,
> >and I believe we will find that we allocate less within APR and
> >allow developers to pass us pre-allocated structures.
>
> Again, garbage collection is the primary issue.  If the user goes
> and allocates our structures, how are our (critical) destructors
> then invoked?  And how are elements within the structure ultimately
> freed?  This part of the approach scares me.
>
> One alternative would be more reusable apr_xxx_t objects, which
> would ensure we don't hit some of these issues.  To be able to
> call an apr_xxx_clear() prior to reuse, or apr_xxx_free() that
> invokes the cleanups might help here, even if they are mostly
> no-ops until a destructor actually returns the memory.
>
> In other works, clearly segregate what c++ users refer to as
> a pre-destruction phase from the actual garbage collection,
> and the allocation from the initialization phase.  This would
> help you realize your concept of detaching us (somewhat) from
> pools all together.


I think this is going to be required.  The original design relied on
pools to do this stuff, but the 2.0 approach should include explicit
destruction methods that are no-ops when pools are used, but that
actually free memory and release resources when pools aren't used.
This part of my thoughts aren't well thought out yet, but I think
there is some wiggle room here.

I think you and I are on the same page, I just took some short-cuts
and didn't explain it as well as you did.   Thanks!   :-)

Ryan

--
Ryan Bloom
rbb@apache.org
rbb@redhat.com
rbloom@gmail.com


-- 
Ryan Bloom
rbb@apache.org
rbb@redhat.com
rbloom@gmail.com

Mime
View raw message