phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From kliewkliew <...@git.apache.org>
Subject [GitHub] phoenix pull request #208: PHOENIX-3046 `NOT LIKE '%'` unexpectedly returns ...
Date Sat, 10 Sep 2016 02:26:18 GMT
Github user kliewkliew commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/208#discussion_r78269568
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/ExpressionCompiler.java
---
    @@ -523,7 +523,12 @@ public Expression visitLeave(LikeParseNode node, List<Expression>
children) thro
                     byte[] wildcard = {StringUtil.MULTI_CHAR_LIKE};
                     StringUtil.fill(nullExpressionString, 0, pattern.length(), wildcard,
0, 1, false);
                     if (pattern.equals(new String (nullExpressionString))) {
    -                    return IsNullExpression.create(lhs, true, context.getTempPtr());
    +                    if (node.isNegate()) {
    +                        return LiteralExpression.newConstant(false, Determinism.ALWAYS);
    --- End diff --
    
    The specification is:
    ```
    5) Case:
    
                a) If M and P are character strings whose lengths are variable
                  and if the lengths of both M and P are 0, then
    
                     M LIKE P
    
                  is true.
    
                b) The <predicate>
    
                     M LIKE P
    
                  is true if there exists a partitioning of M into substrings
                  such that:
    
                  i) A substring of M is a sequence of 0 or more contiguous
                     <character representation>s of M and each <character repre-
                     sentation> of M is part of exactly one substring.
    
                 ii) If the i-th substring specifier of P is an arbitrary char-
                     acter specifier, the i-th substring of M is any single
                     <character representation>.
    
                iii) If the i-th substring specifier of P is an arbitrary string
                     specifier, then the i-th substring of M is any sequence of
                     0 or more <character representation>s.
    
                 iv) If the i-th substring specifier of P is neither an arbi-
                     trary character specifier nor an arbitrary string speci-
                     fier, then the i-th substring of M is equal to that sub-
                     string specifier according to the collating sequence of
                     the <like predicate>, without the appending of <space>
                     characters to M, and has the same length as that substring
                     specifier.
    
                  v) The number of substrings of M is equal to the number of
                     substring specifiers of P.
    
                c) Otherwise,
    
                     M LIKE P
    
                  is false.
    ```
    
    Given that `LEN(NULL)` is `NULL`, `WHERE col IS NOT LIKE '%'` fails cases `a`, `b.i`,
and `b.iii`; defaulting to case `c` and always returning false.
    
    However, I looked through the docs again and noticed the following:
    
    ```
             3) "M NOT LIKE P" is equivalent to "NOT (M LIKE P)".
    ```
    
    in which case `WHERE col IS *NOT LIKE* '%'` should return the inverse result set of `WHERE
col IS *LIKE* '%'` (and should compile to `WHERE col IS NULL`).
    
    I might have misinterpreted something but the specification seems to contradict itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message