impala-reviews mailing list archives

From "Kim Jin Chul (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3282: Adds regexp_escape built-in function
Date Sat, 06 Jan 2018 13:10:19 GMT
Kim Jin Chul has posted comments on this change.

Change subject: IMPALA-3282: Adds regexp_escape built-in function

Patch Set 2:

Commit Message:
PS2, Line 10: ".*\\+?^[](){}$!=:-#\n\r\t\v "
> Where does this list come from? Impala uses RE2 syntax, which does not esca
I've collected the special characters again.
File be/src/exprs/
PS2, Line 4157:   TestStringValue("regexp_escape('')", "Hello\\.world");
> I think the examples in this file could be easier for a reader to understan
Done. Thanks for the information.
PS2, Line 4159:   TestStringValue("regexp_escape('Hello\\\\world')", "Hello\\\\world");
> It seems that the parameter to the regexp_escape function is escaped once s
Done. I've added a comment.
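
The double escaping in the test above can be sketched as follows. This is only an illustration, assuming the SQL string literal parser collapses each backslash pair into one backslash; CollapseBackslashPairs is a hypothetical helper, not code from the patch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the patch): mimics the second unescaping
// layer by collapsing each "\\" pair into a single backslash, which is how
// the SQL string literal parser is assumed to behave here.
std::string CollapseBackslashPairs(const std::string& sql_literal) {
  std::string out;
  for (std::size_t i = 0; i < sql_literal.size(); ++i) {
    if (sql_literal[i] == '\\' && i + 1 < sql_literal.size() &&
        sql_literal[i + 1] == '\\') {
      ++i;  // Skip the first backslash of the pair; the second is appended below.
    }
    out += sql_literal[i];
  }
  return out;
}

// Layer 1: the C++ compiler turns the four backslashes in the source literal
// into two backslashes in the query text.
const std::string kQueryArgument = "Hello\\\\world";

// Layer 2: the (assumed) SQL unescaping leaves a single backslash in the
// value that regexp_escape receives.
const std::string kFunctionInput = CollapseBackslashPairs(kQueryArgument);
```

Under these assumptions the function sees one backslash, and escaping it yields two, which the expected-value literal in the test has to spell with four.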
PS2, Line 4185:   TestStringValue("regexp_escape('a.b*c\\\\d+e?f^g[h]i(j)k{l}m$n!o=p:q-r#s\nt\ru\tv\vw
> We should also directly test that the escaping is correct for our other reg
Done. I've added some mixed cases with other regexp_* functions.
File be/src/exprs/
PS2, Line 624:   const string input = AnyValUtil::ToString(str);
> We can directly iterate with the pointer here. e.g. for (char* c = str.ptr;
PS2, Line 627:     const bool need_escape = special_character_set.find(c) != special_character_set.end();
> I think using std::find on the string literal might be faster than on a set
PS2, Line 638:       default: ss << "\\" << c; break;
> Use '\\'.
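
Taken together, the three suggestions above could look roughly like the sketch below. It is not the code from the patch: RegexpEscapeSketch and kSpecialChars are hypothetical names, the character list is only a stand-in for whatever set the change finally settles on, and the handling of whitespace characters is an assumption.

```cpp
#include <algorithm>
#include <string>

// Hypothetical stand-in for the set of characters to escape; the list in the
// actual change may differ.
static const char kSpecialChars[] = ".*\\+?^[](){}$!=:-#";

// Hypothetical sketch, not Impala's implementation. Takes a pointer and a
// length, as a string value with ptr/len fields would provide.
std::string RegexpEscapeSketch(const char* ptr, int len) {
  std::string out;
  out.reserve(static_cast<std::string::size_type>(len) * 2);
  // Exclude the literal's trailing NUL from the search range.
  const char* const specials_end = kSpecialChars + sizeof(kSpecialChars) - 1;
  // Iterate directly with the pointer instead of copying into a std::string.
  for (const char* c = ptr; c < ptr + len; ++c) {
    switch (*c) {
      // Assumption: whitespace control characters become their escape sequences.
      case '\n': out += "\\n"; break;
      case '\r': out += "\\r"; break;
      case '\t': out += "\\t"; break;
      case '\v': out += "\\v"; break;
      default:
        // std::find over the string literal instead of a std::set lookup.
        if (std::find(kSpecialChars, specials_end, *c) != specials_end) {
          out += '\\';  // A char literal rather than the string "\\".
        }
        out += *c;
        break;
    }
  }
  return out;
}
```

For example, RegexpEscapeSketch("a.b", 3) would return the four characters a\.b under these assumptions.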

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84c3e0ded26f6eb20794c38b75be9b25cd111e4b
Gerrit-Change-Number: 8900
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Kim Jin Chul <>
Gerrit-Reviewer: Tianyi Wang <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-Comment-Date: Sat, 06 Jan 2018 13:10:19 +0000
Gerrit-HasComments: Yes
