db-derby-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Db-derby Wiki] Update of "BuiltInLanguageBasedOrderingDERBY-1478" by MikeMatrigali
Date Tue, 05 Jun 2007 17:26:39 GMT
The following page has been changed by MikeMatrigali:
http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478

------------------------------------------------------------------------------
  == Outstanding items ==
  '''[[GetText(Language changes)]]'''
  
- 1)When a character column is added using CREATE TABLE/ALTER TABLE, make sure that the correct
collate type is populated in the TypeDescriptor's scale field in the SYS.SYSCOLUMNS table.
In order to do this, CREATE TABLE/ALTER TABLE need to get their schema descriptor's collation
type. The collation type of the schema descriptor will decide the collation type of the character
columns defined in CREATE TABLE/ALTER TABLE. This comes from item 2 under Collation Determination
section on this page. ALTER TABLE changes should work for both ADD COLUMN and MODIFY character
column increase length. Part of the code changes required for this item has gone in revision
526385 and 526454 as part of DERBY-2530. Some more language changes and some store changes
need to go in for this task to finish. Those changes involve passing the collation type from
language to the store layer as part of CREATE TABLE and ALTER TABLE. Mike is planning to look
into that.
- 
- '''[[GetText(Question)]]''' Mike, are we done with this line item? I think the language
changes are all done for this line item. Am not sure about store changes.
- 
- 
- 2)Store needs a way to determine the collation type for a given DVD. This collation type
will then be saved in the column metadata. Provide the api on DVD to return the correct collation
type. 
- 
- '''[[GetText(Question)]]''' Will there be an api like DVD.getCollationType()?
- 
- '''[[GetText(Answer)]]''' Mike, sorry, I forget the requirement for this one. Do we still
have a todo here?
- 
  3)Set the correct collation type for return parameter from user defined functions when the
return type is a character string type.
  
  4)Some of the numbered rules in the Collation Derivation sections on this page need to be discussed
to make sure we are setting the collation correctly. This will make sure that in the future, when
we do start to support different collations for different user schemas, things don't fall apart.
The discussion has been started by Dan in a couple of threads (http://www.nabble.com/Collation-and-parameter-markers-%28-%29-tf3866040.html#a10952369
and http://www.nabble.com/Collation-and-string-literals-limitation-in-SQL-standard--tf3868448.html#a10959872).
+ 
  '''[[GetText(Store changes)]]'''
  
+ o All store related code changes have been implemented.
- 1)Store column level metadata for collate in Store. Store keeps a version number that describes
the structure of column level metadata. For existing pre-10.3 databases which get soft upgraded
to 10.3, the structure of column level metadata will remain same as 10.2 structure of column
level metadata, ie they will not include collate information in
- their store metadata. For any conglomerate created in a 10.3 new database or a 10.3 hard
- upgraded database a new version would be used in Store to include information about the
collation for each column's metadata stored. This means that during upgrade, store needs to
change the sturcture of column level metadata to include collate information.
- 
- 2)Check if any code needs to go in for sorter. Not sure if this falls under Language or
Store section. Mike pointed out that Derby has some template code specific to sorter.
  
  '''[[GetText(Testing)]]'''
  
- 1)Add tests for this feature. This a broad umbrella task but I do want to mention 3 specific
tests that we should be testing
+ 1)Add tests for this feature. This is a broad umbrella task, but I do want to mention some specific
tests that we should be running. Currently,
+ tests for this feature can be found in java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest.java
and
+ java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest2.java.  Below
are test cases that need to be verified;
+ as they are verified, they will be moved below to the completed task section.
  
  a)For both a newly created 10.3 database and an upgraded 10.3 database, make sure that metadata
continues to show the scale for character datatypes as 0 (rather than the collation type value).
That is, test that the scale of the character datatypes is always 0 and it didn't get impacted
negatively by the overloading of scale field as collation type in TypeDescriptor.
  
@@ -135, +124 @@

  connect 'jdbc:derby:c:/dellater/db1;create=true;territory=it;collation=TERRITORY_BASED';
  connect 'jdbc:derby:c:/dellater/db1;collation=UCS_BASIC';
  
- h8)Connect to a pre-10.3 database in soft upgrade mode
- connect 'jdbc:derby:c:/dellater/db102';
- 
- h9)Upgrade a pre-10.3 database
- connect 'jdbc:derby:c:/dellater/db102;upgrade=true';
- 
  i)When adding tests for ALTER TABLE, try both add character column AND increase the length
of an existing character column.
  
  j)Upgrade a pre-10.3 database and make sure the upgraded database continues to use UCS_BASIC
for all collations.
@@ -151, +134 @@

  
  '''[[GetText(Network Server)]]'''
  
+ o No network server changes are planned as part of the 10.3 Collation project.
- 1)At this point, I am not sure what kind of work (if any) will be involved for Network Server.
- 
- '''[[GetText(Performance/Desirable items)]]'''
- 
- 1)CollatorSQLChar has a method called getCollationElementsForString which currently gets
called by like method. getCollationElementsForString gets the collation elements for the value
of CollatorSQLChar class. But say like method is looking for pattern 'A%' and the value of
CollatorSQLChar is 'BXXXXXXXXXXXXXXXXXXXXXXX'. This is eg of one case where it would have
been better to get collation element one character of CollatorSQLChar value at a time so we
don't go through the process of getting collation elements for the entire string when we don't
really need. This is a performance issue and could be taken up at the end of the implementation.
Comments on this from Dan and Dag can be found in DERBY-2416. 
- 
- 2)TypeDescriptorImpl's readExternal and writeExternal methods currently save only collation
type of a character string type column in SYSCOLUMNS's COLUMNDATATYPE column. Collation derivation
of character string type column does not get saved anywhere. In this release of Derby, collation
derivation of these persistent character string type columns is always "implicit" and hence
it is safe even if we don't save the collation derivation anywhere. In readExternal method,
we can always initialize the collation derivation to be "implicit" for character string type
columns. But in some future release of Derby, it might be possible to define an explicit collation
type for a column using SQL's COLLATE clause. In such a case, the collation derivation of
persistent column's won't be implicit, rather it will be explicit. In order to support that,
may be we should consider saving the collation derivation starting Derby 10.3 itself. Look
at thread http://www.nabble.com/-jira--Created%3A-
 %28DERBY-2524%29-DataTypeDescriptor%28DTD%29-needs-to-have-collation-type-and-collation-derivation.-These-new-fields-will-apply-only-for-character-string-types.-Other-types-should-ignore-them.-p9842379.html
  
  '''[[GetText(Miscellaneous item)]]'''
- 
+  
  1)Make sure the space padding at the end of various character datatypes is implemented and commented
correctly in the javadocs. This padding is used in collation related methods. For example, check the SQLChar.stringCompare
method.
  
  == Implemented items ==
@@ -209, +186 @@

  
  3)This item is related to item 2. When DVF gets called by store to create right DVD for
given formatid and collation type, for formatids associated with character datatypes, it 
first creates the base character datatype class which is say SQLChar. Then it calls getValue
method on the DVD with the RuleBasedCollator corresponding to the collation type as the parameter.
(This RuleBasedCollator will be null for UCS_BASIC collation). The getValue method returns
SQLChar or CollatorSQLChar depending on whether RuleBasedCollator is null or not. getValue
is the new method which has been added to the interface StringDataValue.
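
The selection step in item 3 can be sketched as follows. This is only an illustration, not Derby's code: PlainChar, CollatedChar and CollationSketch are hypothetical stand-ins for SQLChar, CollatorSQLChar and the DVF entry point, and the cast of Collator.getInstance to RuleBasedCollator assumes a standard JDK locale.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical stand-in for the StringDataValue interface (not Derby's real API).
interface StringValue {
    String getString();
    int compare(StringValue other);
    // Returns a plain or collator-aware value depending on whether collator is null.
    StringValue getValue(RuleBasedCollator collator);
}

// Hypothetical stand-in for SQLChar: compares by UTF-16 code unit order (UCS_BASIC-like).
class PlainChar implements StringValue {
    protected final String value;

    PlainChar(String value) { this.value = value; }

    public String getString() { return value; }

    public int compare(StringValue other) {
        return Integer.signum(value.compareTo(other.getString()));
    }

    public StringValue getValue(RuleBasedCollator collator) {
        // A null collator means UCS_BASIC, so the plain type is kept.
        return collator == null ? new PlainChar(value) : new CollatedChar(value, collator);
    }
}

// Hypothetical stand-in for CollatorSQLChar: compares using the supplied collator.
class CollatedChar extends PlainChar {
    private final RuleBasedCollator collator;

    CollatedChar(String value, RuleBasedCollator collator) {
        super(value);
        this.collator = collator;
    }

    @Override
    public int compare(StringValue other) {
        return Integer.signum(collator.compare(value, other.getString()));
    }
}

class CollationSketch {
    // Mirrors the two steps in the text: build the base type, then call getValue.
    static StringValue create(String s, RuleBasedCollator collator) {
        return new PlainChar(s).getValue(collator);
    }

    public static void main(String[] args) {
        // Assumption: the JDK returns a RuleBasedCollator for this locale.
        RuleBasedCollator english = (RuleBasedCollator) Collator.getInstance(Locale.ENGLISH);
        StringValue b = new PlainChar("B");
        System.out.println("UCS_BASIC-like: " + create("a", null).compare(b));
        System.out.println("collator-based: " + create("a", english).compare(b));
    }
}
```

With a null collator the comparison follows code unit order, so "a" sorts after "B"; with the English collator "a" sorts before "B".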
  
+ 4)When a character column is added using CREATE TABLE/ALTER TABLE, make sure that the correct
collate type is populated in the TypeDescriptor's scale field in the SYS.SYSCOLUMNS table.
In order to do this, CREATE TABLE/ALTER TABLE need to get their schema descriptor's collation
type. The collation type of the schema descriptor will decide the collation type of the character
columns defined in CREATE TABLE/ALTER TABLE. This comes from item 2 under Collation Determination
section on this page. ALTER TABLE changes should work for both ADD COLUMN and MODIFY character
column increase length. Part of the code changes required for this item has gone in revision
526385 and 526454 as part of DERBY-2530. Some more language changes and some store changes
need to go in for this task to finish. Those changes involve passing the collation type from
language to the store layer as part of CREATE TABLE and ALTER TABLE. 
+ 
+ 5)Store column level metadata for collation in Store. Store keeps a version number that describes
the structure of column level metadata. For existing pre-10.3 databases which get soft upgraded
to 10.3, the structure of column level metadata will remain the same as the 10.2 structure of column
level metadata, i.e. they will not include collation information in
+ their store metadata. For any conglomerate created in a new 10.3 database or a 10.3 hard
+ upgraded database, a new version is used in Store to include information about the
collation in each column's stored metadata. This means that during hard upgrade, store needs to
change the structure of column level metadata to include collation information.
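
The versioning idea in item 5 can be sketched as below. This is a simplified illustration and not Derby's on-disk format: the class name, the version numbers, and the byte layout are all assumptions made for the example. The only point shown is that a reader checks the version first and falls back to UCS_BASIC when the older layout carries no collation ids.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical column metadata codec; layout and constants are illustrative only.
class ColumnMetadataSketch {
    static final int VERSION_NO_COLLATION = 1;   // assumed: pre-10.3 style layout
    static final int VERSION_WITH_COLLATION = 2; // assumed: 10.3 style layout
    static final int UCS_BASIC = 0;              // assumed numeric ids for the example
    static final int TERRITORY_BASED = 1;

    // Writes version, column count, format ids, and (new version only) collation ids.
    static byte[] write(int version, int[] formatIds, int[] collationIds) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeInt(formatIds.length);
            for (int id : formatIds) {
                out.writeInt(id);
            }
            if (version >= VERSION_WITH_COLLATION) {
                for (int c : collationIds) {
                    out.writeInt(c);
                }
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the collation ids, defaulting to UCS_BASIC for the old layout.
    static int[] readCollationIds(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int version = in.readInt();
            int columns = in.readInt();
            for (int i = 0; i < columns; i++) {
                in.readInt(); // skip format ids; not needed for this sketch
            }
            int[] result = new int[columns];
            Arrays.fill(result, UCS_BASIC);
            if (version >= VERSION_WITH_COLLATION) {
                for (int i = 0; i < columns; i++) {
                    result[i] = in.readInt();
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] formats = {10, 20};
        int[] collations = {TERRITORY_BASED, UCS_BASIC};
        System.out.println(Arrays.toString(readCollationIds(write(VERSION_NO_COLLATION, formats, collations))));
        System.out.println(Arrays.toString(readCollationIds(write(VERSION_WITH_COLLATION, formats, collations))));
    }
}
```

Because the old layout is read without change, a soft-upgraded database stays readable by the earlier release, which matches the requirement above.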
+ 
+ 6)Calls to sorter did not need to change.  The current interface requires that a non-empty
template be passed in.  That template can be
+ used by the sorter to create new objects with correct collation.  
+ 
+ '''[[GetText(Tests)]]'''
1)Add tests for this feature. This is a broad umbrella task, but I do want to mention some specific
tests that we should be running. Currently,
+ tests for this feature can be found in java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest.java
and
+ java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest2.java.  Below
are test cases that have been implemented.
+ 
+ e)Make sure that a soft-upgraded pre-10.3 database continues to work with the pre-10.3 release,
i.e. the store level column metadata structure should remain unchanged. This ties in with item
1) under the Store section above.  The upgrade tests pass. We could add more test cases, but the basic
change
+ is to base table and index level metadata, which is exercised by existing cases.
+ 
+ h8)Connect to a pre-10.3 database in soft upgrade mode (tested for all pre 10.3 db's by
upgrade suite)
+ connect 'jdbc:derby:c:/dellater/db102'; 
+ 
+ h9)Upgrade a pre-10.3 database (tested for all pre 10.3 db's by upgrade suite)
+ connect 'jdbc:derby:c:/dellater/db102;upgrade=true';
+ 
+ == Related Bugs/Improvements currently not planned for 10.3 release ==
+ '''[[GetText(Type Bugs)]]'''
+ '''[[GetText(Type Improvements)]]'''
+ 
+ 
+ 1)CollatorSQLChar has a method called getCollationElementsForString, which is currently
called by the like method. getCollationElementsForString gets the collation elements for the entire value
of the CollatorSQLChar. But say the like method is looking for pattern 'A%' and the value of
the CollatorSQLChar is 'BXXXXXXXXXXXXXXXXXXXXXXX'. This is one example of a case where it would be
better to get the collation elements of the CollatorSQLChar value one character at a time, so we
don't compute collation elements for the entire string when the first element already
decides the match. This is a performance issue and could be taken up at the end of the implementation.
Comments on this from Dan and Dag can be found in DERBY-2416.
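
A minimal sketch of the incremental approach, using only java.text.CollationElementIterator from the JDK. The class and method names are hypothetical and this is not how Derby's like method is written; it also ignores edge cases such as ignorable elements, and it assumes the JDK returns a RuleBasedCollator for the chosen locale.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper showing a lazy prefix check for patterns like 'A%'.
class PrefixLikeSketch {
    // Returns true if value starts with prefix under the collator, pulling
    // collation elements one at a time and stopping at the first mismatch.
    static boolean startsWithCollated(RuleBasedCollator collator, String value, String prefix) {
        CollationElementIterator valueElements = collator.getCollationElementIterator(value);
        CollationElementIterator prefixElements = collator.getCollationElementIterator(prefix);
        while (true) {
            int prefixElement = prefixElements.next();
            if (prefixElement == CollationElementIterator.NULLORDER) {
                return true; // the whole prefix has been matched
            }
            int valueElement = valueElements.next();
            if (valueElement != prefixElement) {
                // Also covers the case where the value ran out (NULLORDER).
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: the JDK returns a RuleBasedCollator for this locale.
        RuleBasedCollator english = (RuleBasedCollator) Collator.getInstance(Locale.ENGLISH);
        System.out.println(startsWithCollated(english, "BXXXXXXXXXXXXXXXXXXXXXXX", "A"));
        System.out.println(startsWithCollated(english, "Apple", "A"));
    }
}
```

For the 'BXXXXXXXXXXXXXXXXXXXXXXX' value only the first collation element is requested before the loop returns false, so the rest of the string is never examined.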
+ 
+ 2)TypeDescriptorImpl's readExternal and writeExternal methods currently save only the collation
type of a character string type column in SYSCOLUMNS's COLUMNDATATYPE column. The collation derivation
of a character string type column does not get saved anywhere. In this release of Derby, the collation
derivation of these persistent character string type columns is always "implicit", and hence
it is safe even if we don't save the collation derivation anywhere. In the readExternal method,
we can always initialize the collation derivation to be "implicit" for character string type
columns. But in some future release of Derby, it might be possible to define an explicit collation
type for a column using SQL's COLLATE clause. In such a case, the collation derivation of
persistent columns won't be implicit; rather, it will be explicit. In order to support that,
maybe we should consider saving the collation derivation starting with Derby 10.3 itself. Look
at thread http://www.nabble.com/-jira--Created%3A-%28DERBY-2524%29-DataTypeDescriptor%28DTD%29-needs-to-have-collation-type-and-collation-derivation.-These-new-fields-will-apply-only-for-character-string-types.-Other-types-should-ignore-them.-p9842379.html
+ 
+ 
+ 
  == Related Pages ==
  
  Until DERBY-1478 is implemented, the ["LanguageBasedOrdering"] page provides an interim
solution to achieve language based ordering.
