db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "geoff hendrey (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-1748) Global case insensitive setting
Date Thu, 19 Jun 2008 01:03:45 GMT

    [ https://issues.apache.org/jira/browse/DERBY-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606166#action_12606166

geoff hendrey commented on DERBY-1748:

It was mentioned on the derby-user mail list that "auto generated columns" might be a way
to deal with case-insensitive searching. It would be good to bring that discussion onto this
JIRA issue, so that it may be considered in the context of other proposed solutions. 

I re-emphasize that the ability to do case-insensitive LIKE comparisons is absolutely critical.
Think about virtually any application, such as a blog application. It's expected that one
does not have to enter somebody's username or blog posting with exact-matching case in order
to retrieve it. I have tries solutions such as using UPPER in the query. These work only for
trivially small tables. When the number of rows grows, I saw query times taking many seconds
(15 seconds! --it was a while back, but that is my recollection), when I used UPPER to perform
comparisons against a regular index of a string column.

Perhaps there is a hybrid solution, that uses, for example,  UPPER, in conjunction with an
index. What if we could force an index to store an uppercase version of the column? Then as
long as we used UPPER in our query, the search would be perfectly efficient, against the uppercase
index. Or more generally, what if could intercept any column value, before placing it into
the index, or updating the index, and apply a scalar (non-aggregate) built-in function to
the column value?

This is the existing syntax for creating an index:

ON table-Name ( Simple-column-Name [ ASC | DESC ]
    [ , Simple-column-Name [ ASC | DESC ]] * )

Here is a proposed modification, backwards compatible, to allow scalar (non-aggregate) functions
to be applied to columns in the index:

///--begin BNF-like syntax --///

CREATE [UNIQUE] INDEX index-Name ON table-Name ( Intercepted-column [,Intercepted-column]*


((Built-in-function '(' Simple-column-Name ')' |Simple-column-Name) [ASC|DESC]) [ , (Built-in-function
'(' Simple-column-Name ')' |Simple-column-Name) [ ASC | DESC ]] *

///---end BNF-like syntax --///

This allows us to do things like



basically we can apply scalar functions to the columns before they are indexed, and when the
index is updated.

> Global case insensitive setting
> -------------------------------
>                 Key: DERBY-1748
>                 URL: https://issues.apache.org/jira/browse/DERBY-1748
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Terry
> By default MySQL is case insensitive in its string comparisons, as you can see from the
MySQL docs shown below. Similar functionality is available in Sybase iAnywhere and in SQLServer.
I'd like the same to be true for Derby.
> What, I wonder, are chances of that?
> I am aware that functions could be used to force comparisons in upper case but that subverts
the indexes and makes searches unacceptably long.
> If you were to ask people you might find that this is a feature whose abscence is causing
many to look elsewhere.
> thanks for all the great work,
> Terry
> The MySQL Docs say:
> -------- start quote
>  By default, MySQL searches are not case sensitive (although there are some character
sets that are never case insensitive, such as czech). This means that if you search with col_name
LIKE 'a%', you get all column values that start with A or a. If you want to make this search
case sensitive, make sure that one of the operands has a case sensitive or binary collation.
For example, if you are comparing a column and a string that both have the latin1 character
set, you can use the COLLATE operator to cause either operand to have the latin1_general_cs
or latin1_bin collation. For example:
> col_name COLLATE latin1_general_cs LIKE 'a%'
> col_name LIKE 'a%' COLLATE latin1_general_cs
> col_name COLLATE latin1_bin LIKE 'a%'
> col_name LIKE 'a%' COLLATE latin1_bin
> If you want a column always to be treated in case-sensitive fashion, declare it with
a case sensitive or binary collation. See Section 13.1.5, "CREATE TABLE Syntax". 
>     By default, the search is performed in case-insensitive fashion. In MySQL 4.1 and
up, you can make a full-text search by using a binary collation for the indexed columns. For
example, a column that has a character set of latin1 can be assigned a collation of latin1_bin
to make it case sensitive for full-text searches.
> --------------- end quote

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message