db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-5959) Territory-based collation is not robust against changes in the collation rules
Date Tue, 23 Oct 2012 14:59:13 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482379#comment-13482379 ]

Knut Anders Hatlen commented on DERBY-5959:
-------------------------------------------

Since Derby only supports RuleBasedCollator, one solution could be to store the rules (RuleBasedCollator.getRules())
at database-creation time, and use that value to reconstruct the collator on subsequent boots
(using the RuleBasedCollator(String) constructor). This would also allow use of a database
with territory-based collation on a platform that doesn't support the specific locale, as
long as it was available on the platform where the database was created.
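
The round trip described above could look roughly like the following. This is only a sketch, not Derby code: the class and the helpers captureRules() and rebuild() are hypothetical names, and where the rules string would actually be persisted is left open. Only Collator.getInstance(Locale), RuleBasedCollator.getRules() and the RuleBasedCollator(String) constructor come from the JDK.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class StoredCollationRules {

    /** Hypothetical helper: the rules to persist at database-creation time. */
    static String captureRules(Locale territory) {
        Collator collator = Collator.getInstance(territory);
        if (!(collator instanceof RuleBasedCollator)) {
            // The approach only works when the platform hands out a RuleBasedCollator.
            throw new IllegalStateException(
                    "Unsupported collator: " + collator.getClass().getName());
        }
        return ((RuleBasedCollator) collator).getRules();
    }

    /** Hypothetical helper: rebuild the collator on a later boot from the stored rules. */
    static RuleBasedCollator rebuild(String storedRules) {
        try {
            return new RuleBasedCollator(storedRules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Stored collation rules do not parse", e);
        }
    }

    public static void main(String[] args) {
        // At creation time: capture the rules for the database territory.
        String rules = captureRules(Locale.forLanguageTag("tr-TR"));

        // On a later boot: ignore the platform's current rules and use the stored ones.
        RuleBasedCollator rebuilt = rebuild(rules);

        System.out.println("rules length: " + rules.length());
        System.out.println("round trip preserved rules: " + rules.equals(rebuilt.getRules()));
    }
}
```

Note that the rules string does not carry the collator's strength or decomposition mode, so those settings would presumably have to be stored or fixed separately.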

The downside of such an approach is that database users won't automatically get the benefit
of fixes in the collation rules (such as the above-mentioned bug,
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6755060) when upgrading the JRE.

Another approach may be to instruct users to drop and recreate all indexes that contain CHAR/VARCHAR
columns when switching to another JRE, but that may be impractical. Also, if the user fails
to do this, inconsistencies may sneak into the database.
                
> Territory-based collation is not robust against changes in the collation rules
> ------------------------------------------------------------------------------
>
>                 Key: DERBY-5959
>                 URL: https://issues.apache.org/jira/browse/DERBY-5959
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.10.0.0
>            Reporter: Knut Anders Hatlen
>
> When accessing a database with territory-based collation, Derby will use the collation
> rules of the collator returned by Collator.getInstance(databaseLocale). However, there is
> no guarantee that those rules are consistent across different JVM vendors and versions. This
> means that the ordering could vary, and inconsistencies could sneak into the indexes.
> One example is that Oracle's JDK changed the collation rules for Turkish between Java
> 5 and Java 6, so if you run the following script
> connect 'jdbc:derby:memory:db;territory=tr_TR;collation=TERRITORY_BASED;create=true';
> create table t(c char(2));
> insert into t values 'ıa', 'Ia', 'ia', 'İa', 'ıb', 'Ib', 'ib', 'İb';
> select * from t order by c;
> you'll get different results on Java 5 and on Java 6 and later.
> Java 5 will order the results like this:
> ij> select * from t order by c;
> C   
> ----
> ıa  
> Ia  
> ia  
> İa  
> ıb  
> Ib  
> ib  
> İb  
> 8 rows selected
> Java 6 and later order them like this:
> ij> select * from t order by c;
> C   
> ----
> ıa  
> Ia  
> ıb  
> Ib  
> ia  
> İa  
> ib  
> İb  
> 8 rows selected
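
The ordering a given JRE produces for the eight values in the script can also be checked outside Derby. The sketch below sorts them with the collator from Collator.getInstance() for the tr_TR locale. The class and method names are hypothetical, and no expected output is shown because the result depends on the JRE in use.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TurkishOrderCheck {

    /** Sort a copy of the values with the platform collator for the given territory. */
    static List<String> sortWithTerritory(Locale territory, List<String> values) {
        Collator collator = Collator.getInstance(territory);
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(collator); // Collator implements Comparator<Object>
        return sorted;
    }

    public static void main(String[] args) {
        // Same values as in the ij script; \u0131 is dotless i, \u0130 is dotted capital I.
        List<String> values = Arrays.asList(
                "\u0131a", "Ia", "ia", "\u0130a",
                "\u0131b", "Ib", "ib", "\u0130b");

        System.out.println("java.version = " + System.getProperty("java.version"));
        for (String s : sortWithTerritory(Locale.forLanguageTag("tr-TR"), values)) {
            System.out.println(s);
        }
    }
}
```

Running it on two different JREs and comparing the printed order shows whether their Turkish collation rules disagree for these values.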

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
