From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject Re: apr-iconv 1.0
Date Wed, 30 Mar 2005 08:42:17 GMT
At 12:36 AM 3/30/2005, Curt Arnold wrote:

>On Mar 29, 2005, at 11:13 PM, William A. Rowe, Jr. wrote:
>>At 10:09 PM 3/29/2005, Curt Arnold wrote:
>>>I'm unclear on the proposed resolution.  Does it include a mechanism to distinguish
>>>apriconv-1 encoding modules from apriconv-0 modules either by changing the module signature
>>>or name?
>>I believe we should do that, too.  But it seems that some want
>>us to use an 'external' iconv implementation, at least, as of
>>the next major release of apr-util.  With the files residing
>>in /apriconv-1/*.so I'm less concerned with the signature.
>Are you expecting the apriconv-1 directory to be on the PATH or LD_LIBRARY_PATH
>or relative to a directory on the PATH or LD_LIBRARY_PATH?

Well the directory apriconv-1 will reside in a directory pointed
to by PATH/LD_LIBRARY_PATH/SHLIB_PATH.  And (at least on win32) 
within the cwd as we also discover binaries that way.

The theory is simple.  We -found- libapriconv-1.so/.dll.  We'd
already loaded it.  How?  Because it resided in exactly those 
paths, or the subdirectory of the startup dir on win32.

The trivial rule is that libapriconv-1/ directory must sit beside

>If it is on the PATH or LD_LIBRARY_PATH, then there is the continued possibility of collision
>with modules from other iconv implementations; if it is relative to a directory on the path,
>then I would expect that you would need to add a decent amount of complexity to accomplish that.

Not really.  We concatenate two strings, or three.
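
To illustrate the concatenation mentioned above, here is a minimal sketch in plain C. It is not the apr-iconv code: the function name build_module_path and its parameters are hypothetical, and it assumes '/' as the separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of apr-iconv): join the directory in
 * which the library was found, the module subdirectory, and the module
 * file name.  Returns 0 on success, -1 if the result does not fit. */
int build_module_path(char *out, size_t outlen,
                      const char *libdir,  /* e.g. "/usr/local/lib" */
                      const char *subdir,  /* e.g. "apriconv-1" */
                      const char *module)  /* e.g. "utf-8.so" */
{
    size_t len;
    const char *sep;
    int n;

    assert(out != NULL && libdir != NULL && subdir != NULL && module != NULL);

    /* Avoid a doubled separator when libdir already ends in '/'. */
    len = strlen(libdir);
    sep = (len > 0 && libdir[len - 1] == '/') ? "" : "/";

    n = snprintf(out, outlen, "%s%s%s/%s", libdir, sep, subdir, module);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The caller supplies the directory where the library itself was located, so nothing beyond the join is needed to reach the module directory beside it.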

>I would think it may be simpler to have modules named apriconv-1-ENCODING.so or .dll and
>have dlopen or LoadLibrary find them with their existing library search mechanisms, but I
>could understand if that was undesirable for other reasons.
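
For comparison, the flat naming scheme proposed in the quoted paragraph could be sketched as follows. The function module_file_name and its for_windows flag are hypothetical; the file name patterns follow the "apriconv-1-utf-8.dll" and "libapriconv-1-utf-8.so" examples given later in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the bare file name of an encoding module
 * under the flat naming scheme.  The result carries no directory part,
 * so it would be handed to dlopen() or LoadLibrary() and resolved by
 * the platform's own library search rules.
 * Returns 0 on success, -1 on an empty encoding or a too-small buffer. */
int module_file_name(char *out, size_t outlen,
                     const char *encoding, int for_windows)
{
    int n;

    assert(out != NULL && encoding != NULL);
    if (strlen(encoding) == 0)
        return -1;

    if (for_windows)
        n = snprintf(out, outlen, "apriconv-1-%s.dll", encoding);
    else
        n = snprintf(out, outlen, "libapriconv-1-%s.so", encoding);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```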

Hmmm.  String concats are string concats.  Visually, I believed
a directory libapriconv-1 was more clear.  I don't think adding
the extra filename information buys us anything, since -all- iconv
implementations have always shared the concept of a separate dir
for loadable modules.

>>>Without some mechanism to distinguish encoding modules, it really doesn't matter
>>>what order the paths are evaluated.  If there is a mechanism, it shouldn't hurt to evaluate
>>Of course it would, because we would still have to iterate
>>multiple .so files to find the correct one.  This is a huge
>>performance hit.
>>So let me inquire - have you authored a custom (charset).so file,
>>or intend to do so?
>No, I have no intention to do that.  At this point, the resolution to the issue does not
>affect log4cxx since it will continue to hack apriconv to embed the encodings into log4cxx.dll
>and liblog4cxx.so.
>You need to do what is right for httpd and/or Subversion.

Ack.  And I hope - do something that log4cxx ultimately adopts.

>>Again, the consensus on list was that apr-iconv
>>was an implementation-internal detail of apr-util.  With the next
>>release, implementors/users are expected to install the directory
>>apriconv-1 in parallel to apriconv-1.so [.dll].  APR_ICONV_PATH
>>would be searched last, to pick up those exceptions/custom charset
>>modules.  We wouldn't hit iconv/ at all (unless the user asks us
>>to through APR_ICONV_PATH).
>Since Subversion sets APR_ICONV_PATH, there is still the possibility of finding an apriconv-0
>encoding module if the installation is damaged (or the developer decides not to distribute
>some obscure encoding) or a new encoding gets added to Subversion and/or apriconv-0.

Right, except that we would seek APR_ICONV_PATH last.  Minimize
the chance of a collision.  But certainly, not exclude every
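
The search order discussed here (the apriconv-1 directory beside the library first, APR_ICONV_PATH last) can be sketched in plain C as below. This is an illustration only, not the apr-iconv implementation: list_candidates and CAND_LEN are hypothetical names, and treating the APR_ICONV_PATH value as a ':'-separated list is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CAND_LEN 256  /* hypothetical upper bound on a candidate path */

/* Hypothetical helper: fill cand[] with the paths to try, in order.
 * First <libdir>/apriconv-1/<module>, then <dir>/<module> for each
 * directory in iconv_path (stand-in for the value of APR_ICONV_PATH,
 * assumed ':'-separated; may be NULL).  Returns the number of entries. */
size_t list_candidates(char cand[][CAND_LEN], size_t max,
                       const char *libdir, const char *iconv_path,
                       const char *module)
{
    size_t count = 0;
    int n;

    assert(cand != NULL && libdir != NULL && module != NULL);

    /* 1. The directory that sits beside the already-loaded library. */
    if (count < max) {
        n = snprintf(cand[count], CAND_LEN, "%s/apriconv-1/%s", libdir, module);
        if (n >= 0 && n < CAND_LEN)
            count++;
    }

    /* 2. Last: user-supplied directories, for exceptions and custom modules. */
    while (iconv_path != NULL && *iconv_path != '\0' && count < max) {
        size_t len = strcspn(iconv_path, ":");
        if (len > 0) {
            n = snprintf(cand[count], CAND_LEN, "%.*s/%s",
                         (int)len, iconv_path, module);
            if (n >= 0 && n < CAND_LEN)
                count++;
        }
        iconv_path += len;
        if (*iconv_path == ':')
            iconv_path++;
    }
    return count;
}
```

A loader would try each entry in turn and stop at the first one that loads, so a module in APR_ICONV_PATH is only reached when the bundled directory does not supply that charset.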

>>So I'm confused if this solution doesn't resolve the issues of all
>>users.  Please explain further.
>>On your point of avoiding conflicts by changing the signature, yes
>>I would be -happy- to entertain a patch, if you want to post one in
>>the next day.  I'll be putting together my changes tomorrow eve,
>>and would be easily able to integrate your suggestion if you care
>>to offer the patch.  I don't see it as a substitute for searching
>>for the right file, but I see it as added security against loading
>>the wrong module.
>By changing the signature, I meant changing the value of ICMOD_UC_CCS and ICMOD_UC_CES
>(currently 0x100 and 0x101).

>I don't know if this method of module identification is used elsewhere and if so, what
>other values are currently in use.  If the encoding modules can't be distinguished by name,
>I think it is essential that they have a different magic value.  Otherwise, you are just reducing
>the chances of a catastrophic failure by checking APR_ICONV_PATH last, not eliminating it.
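
A magic-value check of the kind suggested in the quoted paragraph might look like the sketch below. Only ICMOD_UC_CCS and ICMOD_UC_CES with the values 0x100 and 0x101 come from this thread; the replacement values, the struct, and the function name are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in the thread for the existing (apriconv-0) modules. */
#define ICMOD_UC_CCS 0x100
#define ICMOD_UC_CES 0x101

/* Hypothetical new values for apriconv-1 modules; the thread does not
 * choose any, these are placeholders. */
#define EXAMPLE_V1_CCS 0x1100
#define EXAMPLE_V1_CES 0x1101

/* Hypothetical, simplified module descriptor holding only the type. */
struct example_module_desc {
    int type;
};

/* Returns 1 if the descriptor carries one of the new magic values,
 * 0 otherwise, so that an old module found on a search path is
 * rejected instead of being used. */
int is_v1_module(const struct example_module_desc *desc)
{
    assert(desc != NULL);
    return desc->type == EXAMPLE_V1_CCS || desc->type == EXAMPLE_V1_CES;
}
```

With a check like this, a stray apriconv-0 module picked up through APR_ICONV_PATH would fail to load cleanly rather than be called with a mismatched interface.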


>Putting myself in the position of the developer of some hypothetical app (and I couldn't
>have an iconv2 based solution), I would prefer that apriconv-1 encoding modules be named something
>like "apriconv-1-utf-8.dll" or "libapriconv-1-utf-8.so" and located using the established
>library search algorithms in dlopen or LoadLibrary.  In deployment, these would likely be
>placed in the same directory as aprutil-1.dll or libaprutil-1.so or the executable if linked
>against a static aprutil-1.  APR_ICONV_PATH could be ignored.  If the user really wanted a
>new encoding, they could either place it in an already searched directory or add the directory
>However, it would be good if we could get some comment from a Subversion developer.

Agreed, and Branko's watching this thread.


