db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-1386) Wrong results with query using LIKE and ESCAPE clause that includes "%", "\", and "_"
Date Fri, 09 Jun 2006 13:57:32 GMT
     [ http://issues.apache.org/jira/browse/DERBY-1386?page=all ]

Knut Anders Hatlen updated DERBY-1386:
--------------------------------------

    Attachment: derby-1386-v1.diff
                derby-1386-v1.stat

Attached a patch (derby-1386-v1.diff) which removes the old, unused
version of lessThanString (the so-called optimized version) and
rewrites the unoptimized version.

First a description of what lessThanString() is supposed to do
(somewhat simplified):

When an SQL statement is compiled, LIKE expressions are rewritten to
make them more efficient. For instance,

  SELECT * FROM T WHERE X LIKE 'abc_ef%'

is rewritten to

  SELECT * FROM T WHERE X >= 'abc' AND X < 'abd' AND X LIKE 'abc_ef%'

(Well, 'abc' and 'abd' will actually be padded with nulls up to the
max length of X, but you get the point...)

Rewriting the statement like this is done to reduce the number of LIKE
comparisons (which are more expensive than >= and <) and to make it
possible to use the index.

The role of lessThanString() is to generate the string used in the
less than expression ('abd' in the example above). This string is the
sub-string leading up to the first wildcard, but with the last
character incremented by one (and, again, the string is padded with
nulls).

Currently, lessThanString() does this:

  1) Find the first wildcard character, if it is preceded by an escape
     character go to the next wildcard. (The error is in this step,
     because the escape character can be escaped.)

  2) Take the character right before the first wildcard and increment
     it.

  3) Build a new string of the pattern string up to the incremented
     character, but without the escape characters.

  4) Pad the string with nulls and return it.

With the patch, it does the following:

  1) Build a new string based on the pattern string by walking the
     pattern string, skipping escape characters and stopping when the
     first unescaped wildcard is found.

  2) Increment the last character of the newly built string.

  3) Pad the string with nulls and return it.

Other changes in the patch:

  - the unused version of lessThanString() was removed

  - some comments about what lessThanString() does were changed since
    the code didn't do what they said

  - new test cases in lang/dynamicLikeOptimization.sql


Derbyall ran without errors on Sun JVM 1.5.0 / Solaris 10 x86.

Reviews would be most appreciated! Thanks!

> Wrong results with query using LIKE and ESCAPE clause that includes "%", "\", and "_"
> -------------------------------------------------------------------------------------
>
>          Key: DERBY-1386
>          URL: http://issues.apache.org/jira/browse/DERBY-1386
>      Project: Derby
>         Type: Bug

>     Versions: 10.2.0.0, 10.1.2.5
>     Reporter: A B
>     Assignee: Knut Anders Hatlen
>     Priority: Critical
>  Attachments: derby-1386-v1.diff, derby-1386-v1.stat
>
> After the fix for DERBY-1262 was checked in, I'm noticing that the following query now
returns different results.  Prior to the fix for DERBY-1262 the query returned 2 rows; now
it doesn't return any rows.
> create table escTable (c1 char(10));
> insert into escTable values ('%_\a');
> insert into escTable values ('%_b');
> insert into escTable values ('%c');
> insert into escTable values ('d');
> insert into escTable values ('%_\e');
> select c1 from escTable where c1 like '\%\_\\%' ESCAPE '\';
> Before DERBY-1262, the SELECT returned:
> C1
> ----------
> %_\a
> %_\e
> 2 rows selected
> Now it returns:
> C1
> ----------
> 0 rows selected
> Brief inspection of the query and data suggest to me that these new results (i.e. no
rows) are wrong, and that Derby should in fact return 2 rows/.
> Based on comments in DERBY-1262, I'm creating a new Jira issue for the regression since
it has been checked into the 10.1 maintenance branch.  I've set the priority to "Critical"
since this could potentially delay a 10.1.3 release--I.e. I don't think we'd want to release
10.1.3 knowing that we have a wrong results regression.  But if anyone thinks that's not the
correct priority, feel free to speak up.
> Other option, of course, is to back out the change for DERBY-1262 in 10.1 and then lower
the priority accordingly.  
> Input/feedback/comments would be appreciated.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Mime
View raw message