subversion-dev mailing list archives

From Thomas Åkesson <tho...@akesson.cc>
Subject Re: Let's discuss about unicode compositions for filenames!
Date Sun, 12 Feb 2012 15:47:45 GMT

On 11 feb 2012, at 13:10, Hiroaki Nakamura wrote:

> Hi,
> 
> 2012/2/9 Thomas Åkesson <thomas@akesson.cc>:
>> Hi,
>> I have been interested in this issue for a couple of years and I remember it was
>> discussed briefly at Subconf in Germany a couple of years ago.
>> 
>> Branching the thread here because I'd like to propose a different approach than Hiroaki.
>> This proposition is not very different from the note "unicode-composition-for-filenames" or
>> what Peter S, Neels and others suggested, perhaps just combining 2 changes slightly differently.
>> 
>> This is based on my limited understanding of WC-NG, please correct me if I make incorrect
>> assumptions.
>> 
>> - Server will still accept both NFC and NFD, however, it will no longer accept collisions.
>> Enforced by normalising to NFD before uniqueness checks during add operations (yes, might
>> be more expensive). There will be no unified normalisation, but the subversion server will
>> work like most filesystems; return what was given to it.
> 
> For compatibility, we cannot ignore existing repositories and working
> copies which have filename
> collisions. So we cannot enforce subversion servers and clients to
> normalize filenames.
> We must let users to choose whether filenames are normalized or not
> per repository.
> 

Perhaps I did not describe this well enough, but I am _not_ suggesting normalized repository
storage, just a normalized uniqueness check during add operations. I believe that normalized
repository storage would cause too many compatibility issues with historical data (as well
as other negative effects noted below).
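
To make the distinction concrete, here is a minimal Python sketch of "store as given, compare normalized". It is an illustration only, not Subversion code: the Directory class and its methods are hypothetical names, and the only real API used is unicodedata.normalize from the Python standard library.

```python
import unicodedata


class Directory:
    """Hypothetical directory: stores names verbatim, checks uniqueness on NFD."""

    def __init__(self):
        # Maps the NFD-normalized key to the name exactly as the client sent it.
        self._entries = {}

    def add(self, name):
        # Normalize only for the comparison; the stored name is left untouched.
        key = unicodedata.normalize("NFD", name)
        if key in self._entries:
            raise ValueError(
                f"name collides with existing entry {self._entries[key]!r}"
            )
        self._entries[key] = name

    def names(self):
        # Returns what was given, never a normalized form.
        return list(self._entries.values())


d = Directory()
nfc = "\u00c5kesson.txt"   # precomposed A-with-ring (NFC)
nfd = "A\u030akesson.txt"  # 'A' followed by combining ring above (NFD)

d.add(nfc)
try:
    d.add(nfd)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch the second add is refused up-front, and names() still returns the precomposed form that was originally supplied.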

The proposition I outlined has _no_ issues whatsoever with existing repositories or working
copies, even if they do have name collisions (which we all agree is rare). What would change
is that _new_ name collisions (names that are equal after normalization) could no longer be
created, while old name collisions could be resolved with 'svn mv'.
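
Finding the old collisions that would need an 'svn mv' could look roughly like the following. Again a sketch under assumptions: find_collisions is a hypothetical helper operating on a plain list of names from one directory, not an existing Subversion function.

```python
import unicodedata
from collections import defaultdict


def find_collisions(names):
    """Group names that become identical after NFD normalization."""
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize("NFD", name)].append(name)
    # Only groups with more than one member are collisions.
    return [group for group in groups.values() if len(group) > 1]


existing = ["caf\u00e9.txt", "cafe\u0301.txt", "readme.txt"]
collisions = find_collisions(existing)
```

Each returned group lists names that differ only in composition, so renaming all but one of them would resolve that collision.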

I am not sure anyone has yet voiced the opinion that Subversion must continue to accept the
creation of new name collisions. Anyone? I think Neels came closest to that opinion, but
my interpretation is that he suggested only that a Subversion server should not normalize. The
more times I read Neels' post (2012-01-30), the more obvious it becomes that what I proposed
is very similar.

There is consensus that compatibility is a high priority for Subversion. Introducing
normalization, translation or the like is risky business for compatibility. The HFS+ file
system has been chastised (both here and on other dev lists) for its behaviour. A file system
is expected to return exactly what was stored, or refuse up-front.

Would it make sense to formalize the different approaches into a couple of RFCs attempting
to summarize the respective implications of each approach? I could try to write one up for
the "Non-normalizing approach". 

/Thomas Å.
