From: Apache Wiki
To: derby-commits@db.apache.org
Date: Thu, 22 Mar 2007 01:52:19 -0000
Message-ID: <20070322015219.5319.36205@eos.apache.osuosl.org>
Subject: [Db-derby Wiki] Update of "BuiltInLanguageBasedOrderingDERBY-1478" by MamtaSatoor

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Db-derby Wiki" for change notification.

The following page has been changed by MamtaSatoor:
http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478

------------------------------------------------------------------------------
  [[TableOfContents(2)]]

  == Design proposals ==
+ In 10.3, a user might create a table with CHAR datatypes along with other datatypes. The task at hand is this: if the user has asked for territory based collation, we want these CHAR datatypes to collate differently from the CHAR datatypes that exist today in 10.2. If the user has not requested territory based collation, we want these CHAR datatypes to collate the way they do in 10.2 today. In short, the CHAR datatype in 10.3 will have different collation behavior depending on what the user has requested. As far as the user is concerned, though, they are just SQL CHAR datatypes and not new SQL datatypes.
+
+ In the original proposal, the intention was to introduce a new internal CHAR datatype that extended the current CHAR datatype in Derby. This would have been implemented by associating a new format id with the new internal CHAR datatype. That proposal carried overhead: new getter methods in DataValueFactory for the new internal datatype, a type compiler for it, and so on. The other issue was that there are many places in the code today where we get character datatypes, and each of those would have to be individually investigated to see which CHAR datatype implementation it should use. If the character datatype is being instantiated for CHAR columns in system tables, the existing CHAR datatype implementation should be used. If it is being instantiated for a user table, the new internal CHAR datatype should be instantiated. And there would be places where we can't determine which of the two CHAR implementations to use, e.g. a string value in a query such as 'abc'.
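As a rough illustration of the behavioral difference described in the first paragraph of the design proposals, the sketch below sorts the same strings in two ways. It is not Derby code: `CollationSketch`, `sortUcsBasic` and `sortTerritory` are hypothetical names, `String.compareTo` (UTF-16 code unit order) is only an approximation of UCS_BASIC, and a `java.text.Collator` for a given `Locale` is assumed as a stand-in for territory based collation.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollationSketch {

    // Approximates UCS_BASIC: plain String ordering compares UTF-16 code
    // units, so every uppercase ASCII letter sorts before every lowercase one.
    public static String[] sortUcsBasic(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Approximates territory based collation: a Collator obtained for the
    // given Locale applies that locale's ordering rules, so letters are
    // grouped by base letter first and case is only a lower-level difference.
    public static String[] sortTerritory(Locale territory, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, Collator.getInstance(territory));
        return copy;
    }

    public static void main(String[] args) {
        String[] data = {"b", "B", "a", "A"};
        // prints [A, B, a, b]
        System.out.println(Arrays.toString(sortUcsBasic(data)));
        // Both forms of "a" come before both forms of "b" here.
        System.out.println(Arrays.toString(sortTerritory(Locale.US, data)));
    }
}
```

The point of the sketch is only that the same CHAR values can order differently under the two collations, which is why the collation has to be known wherever values are compared.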
+
+ The second (current) proposal is based on the idea that CHAR with territory based collation differs from CHAR with default collation in only one respect, namely how values are collated. Everything else is the same. So, as long as we know at collation time which kind of collation we are dealing with, there is no need to generate new internal CHAR datatypes. With this proposal, at compile time, when we associate a DataTypeDescriptor (DTD) with a char column, we record what kind of collation should be associated with that DTD. The collation can be UCS_BASIC, territory based, or unknown. Char columns associated with SYS schemas will always have UCS_BASIC in their DTD. Char columns from user schemas will have UCS_BASIC or territory based collation, depending on what the user requested through the COLLATION attribute in the JDBC URL at database create time. Char columns that are not associated with a specific schema will have their DTD marked with collation unknown; later, at actual collation time (e.g. in the like method or the compare methods), their collation will be determined by the other operand's collation. If the collation of the other operand is also unknown, the collation of such a Char defaults to whatever COLLATION attribute the user requested at database create time. So collation information is saved at the column level in the language layer. Store will follow the same granularity and will write the collation type for every column in its metadata (i.e. for char datatypes as well as non-char datatypes). The collation type is meaningful only for char datatypes; for other datatypes it will be ignored.
+
+ Some of the complexity comes from the fact that a single database can have 2 different collations associated with its columns, i.e. the SYS schema will always use UCS_BASIC for its collation.
But all the user schemas will use either UCS_BASIC or territory based collation. If there were only one collation type for the entire database, the design and implementation would have been far easier, and we could keep collation information at the database level rather than the column level.
+ The thread for these design proposals can be found on the Derby Dev list at http://www.nabble.com/Collation-feature-discussion-tf3418026.html#a9559634

  == Outstanding items ==