apr-dev mailing list archives

From "William A. Rowe, Jr." <ad...@rowe-clan.net>
Subject Re: [REPOST] printf and FMT values.
Date Tue, 08 May 2001 18:43:27 GMT

----- Original Message ----- 
From: "Luke Kenneth Casson Leighton" <lkcl@samba-tng.org>
To: "Justin Erenkrantz" <jerenkrantz@ebuilt.com>
Cc: "Jeff Trawick" <trawickj@bellsouth.net>; <dev@apr.apache.org>
Sent: Tuesday, May 08, 2001 11:22 AM
Subject: Re: [REPOST] printf and FMT values.

> > > > There is no need for apr_*printf formats to be compatible with OS printf
> > > > calls.  We have re-implemented apr_*printf because we needed better
> > > > portability.  In reality, that means that we could easily just define
> > > > single set of format strings.
> > Well, the only glaring omission to apr_*printf() is support for "%lld" -
> the other one that you may wish to consider adding, at some point,
> is Unicode printfs.

How would you make it platform-independent?

Remember, byte ordering is a _significant_ problem across platforms.

Also consider that there is an implicit conversion involved; it's not trivial,
and it isn't clean.
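
To illustrate the byte-ordering point, here is a minimal sketch (the helper names
put_u16_be and put_u16_le are hypothetical, not APR functions). It assumes a
16-bit code unit and shows that the same value yields two different byte
sequences depending on which order is chosen, so any byte-oriented consumer has
to be told which one it is getting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of APR: serialise one 16-bit code unit
 * into two bytes using an explicitly chosen byte order. */
static void put_u16_be(uint16_t unit, unsigned char out[2])
{
    assert(out != NULL);
    out[0] = (unsigned char)(unit >> 8);   /* high byte first */
    out[1] = (unsigned char)(unit & 0xFF);
}

static void put_u16_le(uint16_t unit, unsigned char out[2])
{
    assert(out != NULL);
    out[0] = (unsigned char)(unit & 0xFF); /* low byte first */
    out[1] = (unsigned char)(unit >> 8);
}
```

For U+00E9, put_u16_be() writes the bytes 00 E9 while put_u16_le() writes E9 00.
Code that simply copies a native 16-bit value to the wire gets whichever of the
two the host happens to use.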

It is not trivial to keep cross-platform byte/word oriented code clean and portable.
Believe me, I'm _very_ familiar with Win32 BSTRs and native types.  It isn't
a joking matter when the user has to connect to an external 8-bit oriented
resource, sends non-mappable data at it, and expects to get back what they stowed.

This is why I'm 'turning' apr in the direction of utf-8 mapping.  When we know
the author's intent (such as the ms kernel guys' winnt filesystem) we can try
exposing such unicode constructs in the 'internationalized' build.  
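
As a rough sketch of why a utf-8 mapping sidesteps the byte-order issue: utf-8 is
defined as a sequence of bytes, so the encoded form is the same on every host.
The function below is a hypothetical helper (utf8_encode is not an APR call), and
it assumes the modern four-byte limit (code points up to U+10FFFF, surrogates
rejected) rather than the longer forms older specifications allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an APR function: encode one Unicode code point
 * as utf-8.  Returns the number of bytes written (1 to 4), or 0 if the
 * code point is a surrogate or lies above U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 7 bits: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 11 bits: two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not characters */
        return 0;
    }
    if (cp < 0x10000) {                    /* 16 bits: three bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 21 bits: four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* out of range */
}
```

U+00E9 always comes out as the two bytes C3 A9, whatever the host's byte order,
and plain ASCII passes through unchanged as single bytes.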

> Win32 has %S for 'hey, am i in UNICODE mode?  okay!  let's convert
> this string to char*, then!' and vice-versa when you compile _not_
> in Unicode-mode...

Completely bogus...

> otherwise, Win32 treats a %s as default-whatever-you-have-compiled-as
> [%s is Unicode when compiled as Unicode, char* when compiled as char*]

which is evil.  APR supports the 'good compromise' approach.  We would never
presume a char isn't a char, since ANSI/POSIX both offer wchar_t to accomplish
this.  WinNT further presumes [for filenames today, all resources at some point]
that resource identifiers are unicode.

Apache is forgiving, in the sense that it doesn't care what filesystem is under
the hood.  If the resource can be found, it can be authorized and served.
Greg Stein has asked for some mechanism to distinguish and support local character
set (8-bit) naming, but nobody has offered up any reasonable compromise to WinNT's
flippant attitude toward bouncing around code pages.  On unix, a char value is a
specific value and the file will always 'just open'.  On WinNT, this isn't the case.

I'm very loath to associate any meaning between high-bit char-byte values and
some glyph.  If we must, later, we need to approach the entire apr library in a very
measured way.  Today, utf-8 filename mapping assures us we can 'see' any filename
on the user's volumes (unless they are on Win9x/ME ;->).

But unicode mapping within printf?  Entirely non-portable.

