db-derby-dev mailing list archives

From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Language based matching
Date Tue, 11 Jul 2006 15:43:57 GMT
Hi Kathey,

Here is my understanding of how the disabled national string types worked:

1) A national string type used the collation ordering appropriate to the 
locale of the database. That collation ordering, in turn, was specified 
by the JDK and could not be overridden.

2) The collation ordering determined the meaning of <, =, and > for 
national strings. For a given locale, the rules can be quite tricky. If 
you're not familiar with a locale, you are likely to be surprised by 
visibly different strings which nevertheless turn out to be = to one 
another.

3) The locale-sensitive meaning of <, =, and > affected the operation of 
all orderings of national strings, including sorts, indexes, unions, 
group-by's, like's, between's, and in's.
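
Points 2) and 3) can be illustrated with the JDK's java.text.Collator. This is only a sketch, not how Cloudscape or Derby implemented national strings: the class name CollationDemo and its helper methods are hypothetical, and the choice of Locale.ENGLISH with SECONDARY strength is an assumption made so the example behaves the same on any JDK. It shows two visibly different strings that are = under the collation, and a sort whose result differs from plain String ordering.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical demo class; nothing here is taken from the Derby code base.
public class CollationDemo {

    // Build a collator for a locale at a given strength.
    // Assumption: the strength is chosen by the caller for illustration;
    // the message does not say which strength the national types used.
    static Collator collatorFor(Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator;
    }

    // "=" as defined by the collation: equal when compare() returns 0.
    static boolean collationEquals(Collator collator, String a, String b) {
        return collator.compare(a, b) == 0;
    }

    // Sort a copy of the values using the collation's "<" and ">".
    static String[] sortedBy(Collator collator, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        // SECONDARY strength ignores case differences.
        Collator collator = collatorFor(Locale.ENGLISH, Collator.SECONDARY);

        // Code-point comparison sees two different strings.
        System.out.println("derby".equals("DERBY"));                     // false
        // The collation treats them as equal.
        System.out.println(collationEquals(collator, "derby", "DERBY")); // true

        // Plain String ordering puts upper case before lower case.
        String[] binary = {"apple", "Zebra", "banana"};
        Arrays.sort(binary);
        System.out.println(Arrays.toString(binary)); // [Zebra, apple, banana]

        // The collation orders alphabetically regardless of case.
        System.out.println(Arrays.toString(
                sortedBy(collator, "apple", "Zebra", "banana"))); // [apple, banana, Zebra]
    }
}
```

Any operation built on <, =, and > (an index, a GROUP BY, a BETWEEN) would inherit the same differences, which is why switching the comparison changes query results and not just sort order.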

At one point I was keen on re-enabling the national string types. Now I 
am leaning toward implementing the ANSI collation language. I think this 
is more powerful. In particular, it lets you support more than one 
language-sensitive ordering in the same database.

You and your customer face a hard problem trying to migrate national 
strings from Cloudscape 5.1.60 into Derby 10.1.3 or 10.2. I'm at a loss 
how to do this in a way that preserves Cloudscape's performance.


Kathey Marsden wrote:

> Bernt M. Johnsen wrote:
>> "aa" as one letter was removed from the Norwegian language in 1938 ("å"
>> had been optional since 1917). It is only used in names today and it is
>> true what Anders says about the phonebook (also about the foreign names
>> where "aa" is treated like two letters). I don't think it would be wise
>> to not let "a.*" match "Aasen" (which in modern writing would be Åsen).
> Thank you so much Knut Anders and Bernt for the clarification on
> "aa". I guess now I need a new example and need to understand how
> locale-specific LIKE processing is functionally different from
> regular LIKE behavior and when it is required.
> The user I have been working with is actually migrating from
> Cloudscape 5.1.60 National Character types and the goal was to get a
> workaround to achieve the same behavior in Derby. The example came
> from the doc:
> http://publibfi.boulder.ibm.com/epubs/html/cloud51/doc/html/coredocs/sqlj105.htm#1178996

> Clearly the Derby code still has the code path for the National Type
> special processing.
> In org.apache.derby.iapi.types.SQLChar we have a separate code path
> for National Character types that passes the Collator.
> How is this functionally different from LIKE processing for regular
> character types? Can anyone think of another example where this
> special processing might be needed?
> Thanks
> Kathey
> Below is a SQLChar code snippet for reference.
> public BooleanDataValue like(DataValueDescriptor pattern)
>                                throws StandardException
>    {
>        Boolean likeResult;
>        if (! isNationalString())
>        {
>            // note that we call getLength() because the length
>            // of the char array may be different than the
>            // length we should be using (i.e. getLength()).
>            // see getCharArray() for more info
>            char[] evalCharArray = getCharArray();
>            char[] patternCharArray = ((SQLChar)pattern).getCharArray();
>            likeResult = Like.like(evalCharArray,
>                                   getLength(),
>                                   patternCharArray,
>                                   pattern.getLength());
>        }
>        else
>        {
>            SQLChar patternSQLChar = (SQLChar) pattern;
>            likeResult = Like.like(getIntArray(),
>                                   getIntLength(),
>                                   patternSQLChar.getIntArray(),
>                                   patternSQLChar.getIntLength(),
>                                   getLocaleFinder().getCollator());
>        }
