db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Brett Wooldridge <brett.wooldri...@gmail.com>
Subject Strange collation
Date Wed, 13 Apr 2011 01:21:08 GMT
I have perhaps a misunderstanding/mis-expectation of a database created with
collation TERRITORY_BASED:PRIMARY.

>From the manual:

"Territory based with collation strength PRIMARY. Specify this value to
make Derby behave similarly to many other databases, for which PRIMARY is
commonly the default. PRIMARY typically means that only differences in base
letters are considered significant, whereas differences in accents or case
are not considered significant."

So, I was _hoping_ that indeed Derby would behave similarly to many other
databases, but ...

Assume I create a database as follows:

   jdbc:derby:test;create=true;collation=TERRITORY_BASED:PRIMARY


And create a table as follows:


CREATE TABLE foo (

   id INTEGER GENERATED BY DEFAULT AS IDENTITY,

   column1 VARCHAR(255)

)


And insert a test row...


INSERT INTO foo (column1) VALUES ('test')


A "standard" query works as expected ...


SELECT * FROM foo WHERE column1 LIKE 'TE%'


... does indeed perform a case in-sensitive LIKE as desired, and returns
this row:


ID    COLUMN1

-------------------------------

1      test


So far, so good.  Now this query...


SELECT * FROM foo WHERE id = '1'   -- Note the quoted '1' rather than a raw
1


... returns this error:


Error: Comparisons between 'INTEGER' and 'CHAR (UCS_BASIC)' are not
supported. Types must be comparable. String types must also have matching
collation. If collation does not match, a possible solution is to cast
operands to force them to the default collation (e.g. SELECT tablename FROM
sys.systables WHERE CAST(tablename AS VARCHAR(128)) = 'T1')

SQLState:  42818

ErrorCode: -1


Unfortunately, this is VERY unlike other databases.  This is a trivial query
which I would expect to work.  If Derby knows the collation the the
database, why are casts not automatically performed?  I actually encountered
this error first with this query to list all tables:


SELECT T2.* FROM SYS.SYSTABLES systabs, TABLE
(SYSCS_DIAG.SPACE_TABLE(systabs.tablename)) AS T2

WHERE systabs.tabletype = 'T'

ORDER BY isindex, conglomeratename


Which returns a similar error whereby the only solution is to
CAST(systabs.tabletype AS CHAR).


My company is, or rather was, in the process of switching our product from
Derby with default (UCS_BASIC) collation, to TERRITORY_BASED:PRIMARY in
order to solve performance issues related to the inability to perform case
in-sensitive queries (because using a function like LCASE() will not use an
index).


Unfortunately, given the size of the code-base (several hundred thousand
lines), the fact that a myriad of these CAST related errors could be hiding
in nooks and crannies is forcing us to reconsider.  Even if we _could_ find
and correct all queries were something like an INTEGER is being compared to
a string, having to put CAST everywhere we might interact with system tables
seems gross.


Is there any solution to this?  Any connection mode, startup parameter, etc.
that we're missing?  Is there any reason Derby shouldn't perform these CAST
automatically?  No other database I've used (Oracle, SQL Server, MySQL,
PostgreSQL, ...) seem to have this problem.


Thanks,

Brett Wooldridge

Mime
View raw message