impala-reviews mailing list archives

From "Kim Jin Chul (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3282: Adds regexp_escape built-in function
Date Sat, 06 Jan 2018 13:10:19 GMT
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8900 )

Change subject: IMPALA-3282: Adds regexp_escape built-in function
......................................................................


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8900/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8900/2//COMMIT_MSG@10
PS2, Line 10: ".*\\+?^[](){}$!=:-#\n\r\t\v "
> Where does this list come from? Impala uses RE2 syntax, which does not esca
I've collected the special characters again.


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/expr-test.cc@4157
PS2, Line 4157:   TestStringValue("regexp_escape('Hello.world')", "Hello\\.world");
> I think the examples in this file could be easier for a reader to understan
Done. Thanks for the information.


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/expr-test.cc@4159
PS2, Line 4159:   TestStringValue("regexp_escape('Hello\\\\world')", "Hello\\\\world");
> It seems that the parameter to the regexp_escape function is escaped once s
Done. I've added a comment.
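
For readers counting backslashes in the test at line 4159, here is a small standalone sketch of the C++ layer only. It shows that the four backslashes typed in the test source become two backslashes in the query text handed to the SQL parser. The helper `CountBackslashes` and the constants are hypothetical names for illustration. What the SQL parser then does with those two backslashes is not shown in this message; the expected value in the test is consistent with the parser collapsing them to one before regexp_escape runs, but that is an inference.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the backslash characters actually stored in s.
std::size_t CountBackslashes(const std::string& s) {
  return static_cast<std::size_t>(std::count(s.begin(), s.end(), '\\'));
}

// The query as typed in the test: four backslashes in the C++ source.
const std::string kQueryEscaped = "regexp_escape('Hello\\\\world')";

// The same query as a raw string literal, where no C++ escaping applies:
// only the two backslashes that reach the SQL parser are written.
const std::string kQueryRaw = R"(regexp_escape('Hello\\world'))";
```

Both constants hold the same bytes, so `kQueryEscaped == kQueryRaw` is true and `CountBackslashes(kQueryEscaped)` is 2. Raw string literals are one way to remove a layer of escaping from tests like these.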


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/expr-test.cc@4185
PS2, Line 4185:   TestStringValue("regexp_escape('a.b*c\\\\d+e?f^g[h]i(j)k{l}m$n!o=p:q-r#s\nt\ru\tv\vw x"
> We should also directly test that the escaping is correct for our other reg
Done. I've added some mixed cases that combine regexp_escape with the other regexp_* functions.


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/string-functions-ir.cc@624
PS2, Line 624:   const string input = AnyValUtil::ToString(str);
> We can directly iterate with the pointer here. e.g. for (char* c = str.ptr;
Done


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/string-functions-ir.cc@627
PS2, Line 627:     const bool need_escape = special_character_set.find(c) != special_character_set.end();
> I think using std::find on the string literal might be faster than on a set
Done


http://gerrit.cloudera.org:8080/#/c/8900/2/be/src/exprs/string-functions-ir.cc@638
PS2, Line 638:       default: ss << "\\" << c; break;
> Use '\\'.
Done
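
The three review comments on string-functions-ir.cc could fit together roughly as follows: iterate over the input with a pointer, search a string literal with std::find instead of a set, and emit the backslash as the character '\\'. This is a minimal standalone sketch, not the code from the patch. The name `RegexpEscapeSketch`, the `(ptr, len)` signature standing in for Impala's string value type, and the character list (copied from the patch set 2 commit message, which the author says was collected again afterwards) are assumptions. The patch's switch statement has cases other than `default` that this message does not show, so the sketch simply prefixes every listed character with a backslash.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical list for illustration, copied from the patch set 2 commit
// message; the final list in the change may differ.
static const char kSpecialChars[] = ".*\\+?^[](){}$!=:-#\n\r\t\v ";

// Returns a copy of the input with a backslash in front of every character
// found in kSpecialChars.
std::string RegexpEscapeSketch(const char* ptr, std::size_t len) {
  const char* const special_begin = kSpecialChars;
  // sizeof includes the literal's trailing NUL; subtract 1 to exclude it.
  const char* const special_end = kSpecialChars + sizeof(kSpecialChars) - 1;

  std::string result;
  result.reserve(len * 2);  // worst case: every character is escaped
  for (const char* c = ptr; c < ptr + len; ++c) {
    // Linear search over a short literal, as suggested in the review.
    const bool need_escape =
        std::find(special_begin, special_end, *c) != special_end;
    if (need_escape) result.push_back('\\');  // a char, not the string "\\"
    result.push_back(*c);
  }
  return result;
}
```

With this sketch, `RegexpEscapeSketch("Hello.world", 11)` yields the same value the test at line 4157 expects, written in C++ as "Hello\\.world".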



-- 
To view, visit http://gerrit.cloudera.org:8080/8900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84c3e0ded26f6eb20794c38b75be9b25cd111e4b
Gerrit-Change-Number: 8900
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul <jinchul@gmail.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Kim Jin Chul <jinchul@gmail.com>
Gerrit-Reviewer: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Jan 2018 13:10:19 +0000
Gerrit-HasComments: Yes
