db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Re: territory-based collations and optimizations for the LIKE operator
Date Mon, 25 Aug 2008 16:20:03 GMT
Rick, the behavior we expect for LIKE and = in a territory based
database can be explained by the following comment that I picked up
from DERBY-3166

**********************************************************
The following applies to territory based database
For LIKE, when comparing a pattern against a value string, we do the
comparison of collation elements(s) for one character at a time for
non-metacharacters. This is different than what is done for =
operation. For =, we compare the collation elements for the entire
string on left hand with the collation elements of the entire string
on the right hand side.
For eg say we are working with a territory where character 'z' has
same collation elements as 'xy'. For such a territory consider 2
clauses in WHERE clause
1)'zcb' = 'xycb'
2)'zcb' LIKE 'xy_b'
For case 1), we will return TRUE because the collation elements for
the entire string 'zcb' will match the collation elements of the
entire string 'xycb'.
For case 2) though, we will return FALSE because collation element(s)
for character 'z' does not match the collation element(s) for
character 'x'. So, as can be seen, the LIKE operation is one character
at a time whereas = operation is the entire string at a time.
In addition, the metacharacter _ in pattern for LIKE will consume
*one* character in the string value. So for an eg clause 'xycb' LIKE
'_cb' will return FALSE because metacharacter _ will consume 'x' and
since 'c' does not match 'y', we will return FALSE.
**********************************************************

Is this the information you were looking for?

thanks,
Mamta

On 8/22/08, Rick Hillegas <Richard.Hillegas@sun.com> wrote:
> According to the Developer's Guide section titled "Character-based collation
> in Derby", the LIKE operator behaves differently than the = operator. That
> is, if you are using a territory-based collation, two strings might = one
> another but not be LIKE one another (and vice-versa).
>
> In the meantime, according to the section in the Tuning Guide titled
> "Character string beginning with constant", LIKE expressions which begin
> with a constant (followed by a wildcard) can be transformed into indexable
> expressions involving the ">=" and "<" operators.
>
> To my muddled understanding, it seems that one of the following occurs:
>
> 1) The LIKE optimizations are disabled if you are using territory-based
> collation
>
> or
>
> 2) a LIKE expression may return different results depending on your indexes
> and/or the presence of trailing wildcards in your LIKE expression.
>
> (1) seems like a performance issue which needs to be documented. (2) seems
> like a correctness problem. What is the behavior that we expect?
>
> Thanks,
> -Rick
>

Mime
View raw message