httpd-dev mailing list archives

From Greg Stein <gst...@lyra.org>
Subject Re: cvs commit: apache-2.0/src/lib/apr/include apr.hw
Date Thu, 02 Nov 2000 18:15:22 GMT
On Thu, Nov 02, 2000 at 11:30:13AM -0600, William A. Rowe, Jr. wrote:
> > From: Greg Stein [mailto:gstein@lyra.org]
> > Sent: Thursday, November 02, 2000 11:15 AM
>...
> > Bill, where is the runtime query function to determine how 
> > APR was built?
> 
> I'll step out of the war and let you and Ryan discuss amongst yourselves.
> I'm +1 for a function to determine build options (all of them).  The compile
> time flag is in apr.h.in / apr.hw, along with many more HAS_ macros that
> we could query.  But my understanding is that this isn't resolved, some feel
> it was discussed on the list and vetoed.  I concur that run-time query is
> worthwhile, but let those who disagree bring me up to speed.

I don't recall seeing discussion on the list about this. Are you referring
to something from a long while back, or was it in the past week (and I just
haven't seen it yet)?

The HAS_ macros (or their APR_HAS_ variants) are compile-time. They aren't
enough, as I explained with the Expat case.

> > As we discussed at ApacheCon, if you alter the semantics of the functions
> > (i.e. from native 8-bit encodings to UTF-8 encoding), then you must have a
> > runtime query function.
> 
> You won't.  That is to say, I don't see us ever flipping between the 
> APR_HAS_UNICODE_FS, and not.  To do this on some unixes, perhaps we flip
> by volume, but I'm suggesting that Win32 builds will always use this case.

If you introduce changeable semantics for a function, then the APR user must
also change to meet those variable semantics. You're pushing a problem out
of APR onto the client.

If I write my little cross-platform app, but then find that I can't just
pass in my latin-1 strings when APR is compiled in a certain way, then I'm
going to have to write a bit of conditional code (possibly conditionally
compiled) that will remap those buggers into UTF-8.
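
A minimal sketch of the kind of remapping code this pushes onto the client. The name latin1_to_utf8 is hypothetical and not part of APR; it assumes the input is ISO-8859-1, where every byte maps directly to the Unicode code point of the same value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an APR function: re-encode a NUL-terminated
 * ISO-8859-1 string as UTF-8. Returns the number of bytes written
 * (excluding the terminating NUL), or (size_t)-1 if dst is too small. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;

        if (c < 0x80) {
            /* ASCII is identical in both encodings; keep room for the NUL. */
            if (n + 1 >= dstlen)
                return (size_t)-1;
            dst[n++] = (char)c;
        } else {
            /* 0x80..0xFF becomes a two-byte UTF-8 sequence. */
            if (n + 2 >= dstlen)
                return (size_t)-1;
            dst[n++] = (char)(0xC0 | (c >> 6));
            dst[n++] = (char)(0x80 | (c & 0x3F));
        }
    }

    if (n >= dstlen)
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

A caller would presumably wrap the call in something like `#if APR_HAS_UNICODE_FS`, which is exactly the conditional code described above.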

It doesn't matter when you say "Win32 builds will always use this case." APR
programs are cross-platform. My code that uses APR will have to adjust to
the varying semantics (either within different builds on the same OS, or as
it gets compiled on different OSs).

>...
> > Dropping that flag into apr.h is not enough. Consider a Linux distro with a
> > couple RPMs which have been built. What happens if they are built against an
> > APR with and an APR without the UTF-8 semantics? One of them is going to
> > break at install time.
> 
> Why?  Please cite an example.

Sure.

I compile Subversion against a UTF-8 version of APR because we transport
filenames via XML, thus we UTF-8 encode the filenames. I ship my binaries.

I compile Apache against a native-encoding version of APR because all the
config files use native encoding. I ship my binaries.

Somebody ships an RPM of APR and that is installed on my machine. It was
just a library (the header comes with the "libapr-devel" package).

Okay... now answer this: which app breaks? Subversion or Apache? How was APR
compiled?

[ if you say, "ship two RPMs for APR", I will truly break your legs :-) ]
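
A rough sketch of that mismatch, and of what a runtime query would let an app detect. Everything prefixed demo_ or DEMO_ is hypothetical: DEMO_HAS_UNICODE_FS stands in for the APR_HAS_UNICODE_FS value the app saw in its header at build time, and demo_path_encoding_get() stands in for a query function that APR does not currently have. In a real setup the library half would live in the shared library, not in the app.

```c
#include <assert.h>

/* --- app side: frozen into the binary when the app was compiled --- */
#define DEMO_HAS_UNICODE_FS 1   /* hypothetical stand-in for APR_HAS_UNICODE_FS */

/* --- library side: reflects how the installed library was built --- */
typedef enum {
    DEMO_PATHS_NATIVE = 0,      /* filenames are native 8-bit encoded */
    DEMO_PATHS_UTF8   = 1       /* filenames are UTF-8 encoded */
} demo_path_encoding;

/* Simulates whichever build of the library the RPM happened to install. */
static demo_path_encoding demo_installed = DEMO_PATHS_NATIVE;

void demo_simulate_install(demo_path_encoding enc)
{
    assert(enc == DEMO_PATHS_NATIVE || enc == DEMO_PATHS_UTF8);
    demo_installed = enc;
}

/* Hypothetical runtime query: how was the library actually built? */
demo_path_encoding demo_path_encoding_get(void)
{
    return demo_installed;
}

/* App-side check: does the library match what the app was compiled for?
 * Returns 1 on a match, 0 on a mismatch. */
int demo_build_matches_library(void)
{
    demo_path_encoding expected =
        DEMO_HAS_UNICODE_FS ? DEMO_PATHS_UTF8 : DEMO_PATHS_NATIVE;

    return demo_path_encoding_get() == expected;
}
```

With only the compile-time macro, the app has no way to notice that the library underneath it uses the other semantics.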

> > Another alternative that we discussed at ApacheCon is to have different
> > entry points, each with fixed (non-compile-time-changeable) semantics. At the
> > conference, I was pretty ambivalent about which to choose, but I'm now
> > leaning towards two entry points, rather than a single entry point that can
> > change semantics on people.
> 
> I'm against separate entry points for this reason.
> 
> Unicode OS's (WinNT is one) have a single, native wchar.  char's are just
> accidental remnants.

They are not accidental. They are an important part of all applications
today. We cannot simply say "oh, they are historical; we are going to ignore
them." That is bogus before you even complete that thought.

> However, IP and other cross platform apps require
> something a WHOLE lot more predictable than shifting endianness, and we
> don't want multiple code paths anymore.  To APR apps, it's all opaque with
> the exception that you can get an APR_EBADCH error from an open call, etc.

It is entirely possible that my latin-1 filename will be misinterpreted as
valid UTF-8. That isn't right.
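
For instance, the latin-1 bytes 0xC3 0xA9 spell "Ã©", but the same two bytes are also the well-formed UTF-8 encoding of "é". A small sketch of a well-formedness check shows that such a name passes silently, so APR_EBADCH would not catch it. The function name is_valid_utf8 is hypothetical and is not how APR performs the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checker, not APR code: returns 1 if the NUL-terminated
 * string s is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);

    while (*p != 0) {
        unsigned long cp;
        int extra, i;

        if (*p < 0x80) {                        /* plain ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {       /* 2-byte lead */
            extra = 1; cp = *p & 0x1F;
        } else if ((*p & 0xF0) == 0xE0) {       /* 3-byte lead */
            extra = 2; cp = *p & 0x0F;
        } else if ((*p & 0xF8) == 0xF0) {       /* 4-byte lead */
            extra = 3; cp = *p & 0x07;
        } else {
            return 0;                           /* stray continuation or bad lead */
        }
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)            /* also catches an early NUL */
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* Reject overlong forms, surrogates, and out-of-range values. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}
```

Here is_valid_utf8("\xC3\xA9.txt") accepts a name that was meant as latin-1, while is_valid_utf8("caf\xE9.txt") is rejected. Only the second kind of mismatch is detectable.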

> APR isn't released, and we don't have some arbitrary mandate to provide
> compatibility with APR/Apache 2.0a1.  If we do provide a single entry point,
> then we are in shape to accommodate both worlds without multipathed code.

We can certainly do whatever we'd like with the API, but we have to make it
reasonable and usable.

> Please provide an example where these file semantics aren't opaque.

I get the filename from over the wire. The wire protocol implicitly or
explicitly defines the encoding of the filename. Whatever it is, it may not
match the semantics that APR was compiled with, so my client code is going
to have to manage a translation in certain cases.

Two entry points means that I call the function that has the correct
encoding semantics.
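
A sketch of how that looks from the client side. The two demo_file_open_* functions are hypothetical stubs, not APR entry points; they only report which semantics were applied, where real ones would open the file.

```c
#include <assert.h>
#include <stddef.h>

/* How the wire protocol says the filename is encoded. */
typedef enum { WIRE_NATIVE_8BIT, WIRE_UTF8 } wire_encoding;

/* Which fixed-semantics entry point handled the name. */
typedef enum { OPENED_AS_NATIVE, OPENED_AS_UTF8 } demo_open_result;

/* Hypothetical entry point: the name is ALWAYS treated as native 8-bit. */
demo_open_result demo_file_open_native(const char *name)
{
    assert(name != NULL);
    (void)name;                 /* a real version would open the file */
    return OPENED_AS_NATIVE;
}

/* Hypothetical entry point: the name is ALWAYS treated as UTF-8. */
demo_open_result demo_file_open_utf8(const char *name)
{
    assert(name != NULL);
    (void)name;                 /* a real version would open the file */
    return OPENED_AS_UTF8;
}

/* Client code: choose by what the protocol defines, not by how the
 * library happened to be compiled. */
demo_open_result open_from_wire(const char *name, wire_encoding enc)
{
    switch (enc) {
    case WIRE_UTF8:
        return demo_file_open_utf8(name);
    case WIRE_NATIVE_8BIT:
    default:
        return demo_file_open_native(name);
    }
}
```

Neither function changes meaning between builds, so the client needs no conditional compilation and no translation step of its own.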

> > So... this Unicode stuff is incomplete until you have a 
> > runtime query (-0) or separate entry points (+1).
> 
> I agree it's incomplete, that's why it's in the tree, to move forward :-)

My point was simply that this issue must be handled before we can call it
"done" and ship it.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
