cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aristedes Maniatis <...@maniatis.org>
Subject Re: (CAY-1210) mysql does not use index for case insensitive searches
Date Wed, 12 Jan 2011 05:01:05 GMT
On 11/01/11 10:18 PM, Andrew Lindesay wrote:
> I think the functional indexes do make sense and it seems reasonable that a DBA would
apply them as part of tuning.  Maybe it would be best to have a page in the manual on this
explaining that for MySQL/... CI search it is best to change the collation on the schema objects
and for Oracle/PG/... CI search it is best to implement functional indexes?

It is all a bit ugly. Postgresql docs recommend using LOWER() to achieve ci search, whereas
Cayenne spits out UPPER(). Hibernate also produces LOWER(). So a db used by different systems
is going to need quite a few indexes.

Right now, I'm looking for a solution for Derby and mysql in particular, which both lack functional
indexes. And a MS-SQL installation which is already in ci 'mode' anyway and so adding functional
indexes is just a workaround for Cayenne executing inappropriate SQL for this environment.

Another small gotcha for developers is that Expression.filterObjects may be inconsistent with
the queries against the database, when the expectation is that it should be the same.

A ci database will always be ci, unless we add the BINARY keyword (I know that works for mysql,
not sure about everything else). [1] However that also bypasses the use of the index, so we
don't want to add it when it isn't needed. It is easy to be correct. But correct and fast
is much harder.


On 11/01/11 11:04 PM, Andrus Adamchik wrote:
> Maybe do a total hack to get us out of this limbo - how about a DI extension point for
CI LIKE SQL generation strategy (or rather a System property activating this strategy - 'default
collation case sensitivity')? This won't complicate the mapping, won't force us to generalize,
and will allow the same mapping to be used with both kinds of schemas.

Well, a database-wide property solves my particular use-case. Seems clunky (since it is database-wide
and not specific to columns), but if you are convinced that case-sensitivity is not an attribute
of the model, then this is the only way. I still can't help but think this is a property of
an attribute in the model though. Other than some JDBC specification, why is the choice between
BLOB, CLOB and TEXT a modeller property, but the choice between ci-TEXT and cs-TEXT isn't?

"BLOB values are treated as binary strings (byte strings). They have no character set, and
sorting and comparison are based on the numeric values of the bytes in column values. TEXT
values are treated as nonbinary strings (character strings). They have a character set, and
values are sorted and compared based on the collation of the character set." Effectively BLOBs
are just TEXT with collation = none. I've not tried to see what happens if you try to perform
a case insensitive search on BLOB from Cayenne. Will it try to output UPPER?



Out of interest, my googling discovered that Atlassian developers are running into this problem
with Hibernate. But they didn't have a nice solution. http://jira.atlassian.com/browse/CONF-10030

Regards

Ari



[1] http://dev.mysql.com/doc/refman/5.0/en/charset-binary-op.html

-- 
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Mime
View raw message