db-derby-dev mailing list archives

From "Daniel John Debrunner (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2336) Enable collation based ordering for CHAR data type.
Date Tue, 06 Mar 2007 19:48:24 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478538 ]

Daniel John Debrunner commented on DERBY-2336:

Some of my thoughts in this area are biased by the existing implementation of (not-exposed)
national character types, which was originally done quickly and, in my opinion, rather poorly.
Typically in software engineering the second rewrite is an improvement on the first, so there
is an opportunity here to make the second rewrite better rather than repeat the mistakes
of the first.

Some of the mistakes of the first are:
  - using the context to obtain the locale (even commented as a HACK)
  - pushing the locale ordering code into the base class (SQLChar) so that non-locale based
ordering has to deal with additional overhead
  - basing the ordering on a locale and not a Collator

I think there is also plenty of opportunity to make incremental progress here. Steps I could
see happening are:

  1) Create the new character data value classes that perform ordering based upon a Collator.
     Have the collator be a field in the class and hard-code it to some language for now
     (say Norwegian :-)
  2) Make those classes the ones used when locale based ordering is required.
  3) Write some tests that ensure the order does change when the collation property is set
     on database creation (can be converted into real tests later by using a Norwegian database).
  4) Delete the old national character types and remove their overhead from the UCS_BASIC
     ordering classes (SQLChar etc.)

  5) Figure out how to set the real Collator object for a DataValueDescriptor during runtime
     and recovery.

Of course step 5) doesn't have to be done last; it's independent of steps 1-4.

  6) Write more tests for other languages.
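
As a rough illustration of step 1, here is a minimal sketch of a value class that carries its
own Collator. CollatedChar and its sorted() helper are hypothetical names made up for this
example, not Derby classes, and the Norwegian collator is hard-coded exactly as a placeholder.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical stand-in for a collation-aware character value; not a Derby class. */
final class CollatedChar implements Comparable<CollatedChar> {
    // Step 1: hard-code the collator to one language for now (Norwegian).
    private static final Collator HARD_CODED =
            Collator.getInstance(Locale.forLanguageTag("no-NO"));

    private final String value;
    // The Collator is a field of the value, not something looked up from a context.
    private final Collator collator;

    CollatedChar(String value) {
        this(value, HARD_CODED);
    }

    CollatedChar(String value, Collator collator) {
        this.value = value;
        this.collator = collator;
    }

    @Override
    public int compareTo(CollatedChar other) {
        // Ordering is delegated entirely to the Collator held by this value.
        return collator.compare(value, other.value);
    }

    @Override
    public String toString() {
        return value;
    }

    /** Sorts plain strings by wrapping them in CollatedChar values. */
    static String[] sorted(String... values) {
        CollatedChar[] wrapped = new CollatedChar[values.length];
        for (int i = 0; i < values.length; i++) {
            wrapped[i] = new CollatedChar(values[i]);
        }
        Arrays.sort(wrapped);
        String[] out = new String[values.length];
        for (int i = 0; i < wrapped.length; i++) {
            out[i] = wrapped[i].toString();
        }
        return out;
    }
}
```

A test in the spirit of step 3 could check that CollatedChar.sorted("Zebra", "apple") puts
"apple" first, while String.compareTo (plain code point order) puts "Zebra" first. Because the
plain ordering class is untouched, it pays nothing for the collation support.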

I agree that some of the locale issues with data types are being confused here. Mamta found
a valid bug where conversion of a string to a date-time value is not being handled correctly.
That is a separate issue, but it's being confused because the discussion so far has not really
described the actual problems; it's focused on getting the locale, based upon the old national
character types code.
The real problems here are:

   1) How to get the correct Collator object for character comparisons
   2) How to get the correct object to parse date-time values from Strings

1) is just locale based for this issue, but there is the chance to have a framework that works
with more than locale based ordering, such as case-insensitive ordering. Focusing on the locale
increases the likelihood that the solution will not be expandable to other types.
E.g. if we duplicate code for locale in the store, do we need to duplicate code again for
case-insensitive searches, and then again for another Collator style?
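
To make the point about 1) concrete, here is a small sketch in which the comparison code only
ever sees a Collator, so a locale based ordering and a case-insensitive ordering go through the
same path. CollatorChoices and its methods are hypothetical names for illustration, and using
SECONDARY strength on the root locale is just one assumed way to get case-insensitive ordering.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper showing that callers need a Collator, not a Locale. */
final class CollatorChoices {
    private CollatorChoices() {
    }

    /** Locale based ordering at the default strength. */
    static Collator forLocale(Locale locale) {
        return Collator.getInstance(locale);
    }

    /** Case-insensitive ordering: SECONDARY strength ignores case differences. */
    static Collator caseInsensitive() {
        Collator collator = Collator.getInstance(Locale.ROOT);
        collator.setStrength(Collator.SECONDARY);
        return collator;
    }

    /** The one shared code path: sorts a copy of the input with whatever Collator it is given. */
    static String[] sortWith(Collator collator, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, collator);
        return copy;
    }
}
```

With this shape, adding another Collator style means adding another factory method; sortWith
and anything else that compares values stays unchanged.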

2) is easier because it doesn't need to worry about recovery and is more closely related to
the solution used for the Calendar object.
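
For 2), a minimal sketch of the same idea applied to parsing: the object that parses date
values is created once for a locale and handed to the code that needs it. DateStringParser is a
hypothetical name, the MEDIUM date style and the UTC time zone are assumptions made to keep the
example deterministic, and nothing here reflects how Derby handles the Calendar object.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical parser that owns its locale-specific format object; not a Derby class. */
final class DateStringParser {
    // DateFormat is not thread-safe, so each parser instance keeps its own.
    private final DateFormat format;

    DateStringParser(Locale locale) {
        format = DateFormat.getDateInstance(DateFormat.MEDIUM, locale);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    String format(Date date) {
        return format.format(date);
    }

    Date parse(String text) {
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date in this locale: " + text, e);
        }
    }

    /** Midnight UTC on the given day; month is a Calendar constant such as Calendar.MARCH. */
    static Date utcDate(int year, int month, int day) {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        calendar.clear();
        calendar.set(year, month, day);
        return calendar.getTime();
    }
}
```

No recovery concerns arise in this sketch because the parser is only needed while a statement
runs; a string formatted by one DateStringParser parses back to the same Date with it.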

> Enable collation based ordering for CHAR data type.
> ---------------------------------------------------
>                 Key: DERBY-2336
>                 URL: https://issues.apache.org/jira/browse/DERBY-2336
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions:
>            Reporter: Mamta A. Satoor
>         Attachments: DERBY_LocalFinder_CodeCleanup_diff_V01.txt, DERBY_LocalFinder_CodeCleanup_stat_V01.txt
> I am breaking down the Parent task DERBY-1478 (Add built in language based ordering and
> like processing to Derby) into multiple sub tasks. One of them is to concentrate on enabling
> the collation based ordering on (hopefully the simplest of all the character data types) CHAR
> data type. This task in itself might need subtasks if it is later found that it can be subdivided
> into multiple smaller steps.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
