cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrus Adamchik <and...@objectstyle.org>
Subject Re: (CAY-1210) mysql does not use index for case insensitive searches
Date Wed, 12 Jan 2011 12:55:17 GMT

On Jan 12, 2011, at 7:01 AM, Aristedes Maniatis wrote:

> On 11/01/11 10:18 PM, Andrew Lindesay wrote:
>> I think the functional indexes do make sense and it seems reasonable that a DBA would
apply them as part of tuning.  Maybe it would be best to have a page in the manual on this
explaining that for MySQL/... CI search it is best to change the collation on the schema objects
and for Oracle/PG/... CI search it is best to implement functional indexes?
> 
> It is all a bit ugly. Postgresql docs recommend using LOWER() to achieve ci search, whereas
Cayenne spits out UPPER(). Hibernate also produces LOWER(). So a db used by different systems
is going to need quite a few indexes.

Not ideal, but DB indexes always depend on the data access patterns by the clients, not the
other way around. 

> On 11/01/11 11:04 PM, Andrus Adamchik wrote:
>> Maybe do a total hack to get us out of this limbo - how about a DI extension point
for CI LIKE SQL generation strategy (or rather a System property activating this strategy
- 'default collation case sensitivity')? This won't complicate the mapping, won't force us
to generalize, and will allow the same mapping to be used with both kinds of schemas.
> 
> Well, a database-wide property solves my particular use-case. Seems clunky (since it
is database-wide and not specific to columns), but if you are convinced that case-sensitivity
is not an attribute of the model, then this is the only way. 

From our discussion, it can be a property of the model (either DB-wide or per-column), or
a property of a specific DB deployment. Also I am not completely opposed to mapping case sensitivity
settings per DbAttribute, but like I said, I am very conservative about adding new mapping
abstractions, so I'd rather we try it in a less invasive way first (that would allow us to
test a new CI LIKE strategy) and then decide if it is worthwhile extending DbAttribute.

Anyways, I'll add the description of this interim solution to Jira.

> Other than some JDBC specification, why is the choice between BLOB, CLOB and TEXT a modeller
property, but the choice between ci-TEXT and cs-TEXT isn't?
> "BLOB values are treated as binary strings (byte strings). They have no character set,
and sorting and comparison are based on the numeric values of the bytes in column values.
TEXT values are treated as nonbinary strings (character strings). They have a character set,
and values are sorted and compared based on the collation of the character set." Effectively
BLOBs are just TEXT with collation = none.

There's no TEXT type in JDBC (and Cayenne). The only similarity between BLOB and CLOB is that
they are sequences of elements. Beyond that they are different data type (of course you can
represent anything as a BLOB, but that would be highly impractical). So not sure where this
analogy goes (and like I said above I am not completely against an extra CS property for *char*
types).

> Out of interest, my googling discovered that Atlassian developers are running into this
problem with Hibernate. But they didn't have a nice solution. http://jira.atlassian.com/browse/CONF-10030

Yeah, that shows that we are not inventing the problem at least.

Andrus
Mime
View raw message