From: "Mamta A. Satoor (JIRA)"
To: derby-dev@db.apache.org
Date: Tue, 6 Feb 2007 07:02:06 -0800 (PST)
Subject: [jira] Commented: (DERBY-1478) Add built in language based ordering and like processing to Derby
Message-ID: <17853160.1170774126980.JavaMail.jira@brutus>
In-Reply-To: <11092987.1152139470894.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/DERBY-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470603 ]

Mamta A. Satoor commented on DERBY-1478:
----------------------------------------

Rick, I looked at SQL specification (Part 2) regarding SQL identifiers.
For background, some general information on SQL identifiers from the SQL spec is as follows:

1) As per SQL specification Part 2, Section 4.2.4, the character repertoire for SQL identifiers, SQL_IDENTIFIER, consists of Latin characters and digits, and all the other characters that the SQL-implementation supports for use in . After this, everything else related to the SQL_IDENTIFIER character repertoire is implementation-defined. To be specific:
2) Section 4.2.5, Character encoding form, Pg 22, says SQL_IDENTIFIER is an implementation-defined character encoding form. It is applicable to the SQL_IDENTIFIER character repertoire.
3) Section 4.2.6, Collation, Pg 23, says SQL_IDENTIFIER is an implementation-defined collation. It is applicable to the SQL_IDENTIFIER character repertoire.
4) And lastly, in Section 4.2.7, Character Sets, SQL_IDENTIFIER is a character set whose repertoire is SQL_IDENTIFIER and whose character encoding form is SQL_IDENTIFIER. The name of its default collation is SQL_IDENTIFIER.
5) Section 4.2.3.1, Pg 19, talks about case folding, which it describes as a pair of functions for converting all the lower case and title case characters in a given string to upper case, or all the upper case and title case characters to lower case. A lower case character is a character in the Unicode General Category class "Ll" and an upper case character is a character in the Unicode General Category class "Lu".

From the information above, we see that the SQL specification leaves the character encoding form (CEF) and collation for SQL identifiers as implementation-defined, but I do not see it saying specifically that case folding is implementation-defined. Even Section 4.2.3.1, Pg 19, second paragraph, talks about converting case in a generic manner, in the context of Unicode and not the English locale. So, I am not sure why Derby/Cloudscape chose to use the English locale to do case conversion of SQL identifiers.
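To make the locale question concrete, here is a minimal Java sketch of how the locale passed to the case conversion changes the result, and how 1:1 character mappings differ from 1:n string mappings. The class and helper names (CaseFoldingSketch, upperEnglish, upperTurkish, upperCharByChar, isLl) are hypothetical and exist only for this illustration; this is not Derby's code, only standard java.lang and java.util calls.

```java
import java.util.Locale;

public class CaseFoldingSketch {

    // Hypothetical helper: uppercase with English rules,
    // independent of the JVM default locale.
    public static String upperEnglish(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    // Hypothetical helper: uppercase with Turkish rules, where a
    // lower case i maps to the dotted capital I (U+0130).
    public static String upperTurkish(String s) {
        return s.toUpperCase(Locale.forLanguageTag("tr-TR"));
    }

    // Hypothetical helper: uppercase one char at a time, so only
    // 1:1 mappings can apply and the length never changes.
    public static String upperCharByChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toUpperCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // True if the code point is in Unicode General Category "Ll".
    public static boolean isLl(int codePoint) {
        return Character.getType(codePoint) == Character.LOWERCASE_LETTER;
    }

    public static void main(String[] args) {
        // English rules: i uppercases to plain I.
        System.out.println(upperEnglish("i").equals("I"));
        // Turkish rules: i uppercases to the dotted capital I.
        System.out.println(upperTurkish("i").equals("\u0130"));
        // String-level uppercasing applies the 1:n mapping of sharp s to "SS".
        System.out.println(upperEnglish("\u00df").equals("SS"));
        // Char-level uppercasing has no 1:1 mapping for sharp s and leaves it as is.
        System.out.println(upperCharByChar("\u00df").equals("\u00df"));
        // 'a' is in General Category Ll.
        System.out.println(isLl('a'));
    }
}
```

Under these assumptions, an identifier such as "insert" would uppercase differently depending on whether English or Turkish rules are applied, which is the kind of difference a fixed locale avoids.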
Derby's StringUtil class, where the SQL case conversion code lies, has the following comment:

// The functions below are used for uppercasing SQL in a consistent manner.
// Cloudscape will uppercase Turkish to the English locale to avoid i
// uppercasing to an uppercase dotted i. In future versions, all
// casing will be done in English. The result will be that we will get
// only the 1:1 mappings in
// http://www.unicode.org/Public/3.0-Update1/UnicodeData-3.0.1.txt
// and avoid the 1:n mappings in
// http://www.unicode.org/Public/3.0-Update1/SpecialCasing-3.txt
//
// Any SQL casing should use these functions

Dan, you mentioned in one of your comments on this Jira entry that "Currently the uppercasing of SQL statements and identifiers is fixed as English to avoid unexpected issue with other languages". Can you please explain what you mean by unexpected issues? Is that the same reason for recommending the same behavior for system tables?

> Add built in language based ordering and like processing to Derby
> -----------------------------------------------------------------
>
>                 Key: DERBY-1478
>                 URL: https://issues.apache.org/jira/browse/DERBY-1478
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.1.2.1
>            Reporter: Kathey Marsden
>         Assigned To: Mamta A. Satoor
>         Attachments: DERBY-1478_FunctionalSpecV1.html
>
>
> It would be good for Derby to have built in Language based ordering based on locale specific Collator.
> Language based ordering is an important feature for international deployment. DERBY-533 offers one implementation option for this but according to the discussion in that issue National Character Types carry a fair amount of baggage with them especially in the form of concerns about conversion to and from datetime and number types. Rick mentioned SQL language for collations as an option for language based ordering.
There may be other options too, but I thought it worthwhile to add an issue for the high level functional concern, so the best choice can be made for implementation without assuming that National Character Types is the only solution.
> For possible 10.1 workaround and examples see:
> http://wiki.apache.org/db-derby/LanguageBasedOrdering

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.