impala-reviews mailing list archives

From "Zach Amsden (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4729: Implement REPLACE()
Date Fri, 03 Feb 2017 19:28:27 GMT
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4729: Implement REPLACE()
......................................................................


Patch Set 6:

(1 comment)

Here are the timings after pre-loading the data (I lost the first few lines; I can't figure out how
to get more scrollback in bash on Windows yet ;) ).

But we can see replace() now wins on all simple replacements.  The first version lost horribly
when replacing a single space with 17 spaces, so the fancier buffer sizing for expanding
patterns was actually required.
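
To illustrate why buffer sizing matters for expanding patterns, here is a minimal sketch. It is
not the code in this patch: ReplaceAll is a hypothetical standalone helper built on std::string,
and treating an empty pattern as a no-op is an assumption of the sketch. It sizes the output
once from a worst-case bound instead of regrowing the buffer while copying.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not Impala's implementation: replaces every
// non-overlapping occurrence of `pattern` in `input` with `replacement`.
std::string ReplaceAll(const std::string& input, const std::string& pattern,
                       const std::string& replacement) {
  // Assumption of this sketch: an empty pattern matches nothing.
  if (pattern.empty()) return input;

  // Upper bound on the output length. If the replacement is no longer than
  // the pattern, the output cannot exceed the input. Otherwise at most
  // input.size() / pattern.size() matches fit, each adding the difference.
  std::size_t capacity = input.size();
  if (replacement.size() > pattern.size()) {
    const std::size_t max_matches = input.size() / pattern.size();
    capacity += max_matches * (replacement.size() - pattern.size());
  }

  std::string out;
  out.reserve(capacity);  // One allocation up front, no regrowth while copying.

  std::size_t pos = 0;
  while (true) {
    const std::size_t hit = input.find(pattern, pos);
    if (hit == std::string::npos) break;
    out.append(input, pos, hit - pos);  // Copy the unmatched text before the hit.
    out.append(replacement);
    pos = hit + pattern.size();         // Resume after the match (no overlaps).
  }
  out.append(input, pos, std::string::npos);  // Copy the remaining tail.
  return out;
}
```

The worst-case bound can over-allocate badly (up to 17x the input for the single-space case
above), so a real implementation would likely cap it or count the matches first.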

+------------------------------------------------------------------+
| sum(length(regexp_replace(l_comment, ' ', '                 '))) |
+------------------------------------------------------------------+
| 496964585                                                        |
+------------------------------------------------------------------+
Fetched 1 row(s) in 3.63s
[localhost:21000] > select sum(length(replace(l_comment, ' ', '                 '))) from tpch.lineitem;
Query: select sum(length(replace(l_comment, ' ', '                 '))) from tpch.lineitem
Query submitted at: 2017-02-03 19:21:42 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=504aabb65f306ab1:7601c2c800000000
+-----------------------------------------------------------+
| sum(length(replace(l_comment, ' ', '                 '))) |
+-----------------------------------------------------------+
| 496964585                                                 |
+-----------------------------------------------------------+
Fetched 1 row(s) in 1.63s
[localhost:21000] > select sum(length(regexp_replace(l_comment, ' ', ''))) from tpch.lineitem;
Query: select sum(length(regexp_replace(l_comment, ' ', ''))) from tpch.lineitem
Query submitted at: 2017-02-03 19:21:58 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=9440a311864f0940:f89d1f3e00000000
+-------------------------------------------------+
| sum(length(regexp_replace(l_comment, ' ', ''))) |
+-------------------------------------------------+
| 137874248                                       |
+-------------------------------------------------+
Fetched 1 row(s) in 3.04s
[localhost:21000] > select sum(length(replace(l_comment, ' ', ''))) from tpch.lineitem;
Query: select sum(length(replace(l_comment, ' ', ''))) from tpch.lineitem
Query submitted at: 2017-02-03 19:22:09 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=c345d0f14967fd99:4b23ef8c00000000
+------------------------------------------+
| sum(length(replace(l_comment, ' ', ''))) |
+------------------------------------------+
| 137874248                                |
+------------------------------------------+
Fetched 1 row(s) in 1.54s
[localhost:21000] > select sum(length(regexp_replace(l_comment, 'e', 'I'))) from tpch.lineitem;
Query: select sum(length(regexp_replace(l_comment, 'e', 'I'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:22:47 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=d8405ff45581ef67:7df386ab00000000
+--------------------------------------------------+
| sum(length(regexp_replace(l_comment, 'e', 'i'))) |
+--------------------------------------------------+
| 158997209                                        |
+--------------------------------------------------+
Fetched 1 row(s) in 2.84s
[localhost:21000] > select sum(length(replace(l_comment, 'e', 'I'))) from tpch.lineitem;
Query: select sum(length(replace(l_comment, 'e', 'I'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:22:58 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=5f42bba668201666:623a45f200000000
+-------------------------------------------+
| sum(length(replace(l_comment, 'e', 'i'))) |
+-------------------------------------------+
| 158997209                                 |
+-------------------------------------------+
Fetched 1 row(s) in 1.63s
[localhost:21000] > select sum(length(regex_replace(l_comment, 'he', 'HE'))) from tpch.lineitem;
Query: select sum(length(regex_replace(l_comment, 'he', 'HE'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:23:30 (Coordinator: http://impala-dev:25000)
ERROR: AnalysisException: default.regex_replace() unknown

[localhost:21000] > select sum(length(regexp_replace(l_comment, 'he', 'HE'))) from tpch.lineitem;
Query: select sum(length(regexp_replace(l_comment, 'he', 'HE'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:23:37 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=134a4fc63cf86279:372eeb6f00000000
+----------------------------------------------------+
| sum(length(regexp_replace(l_comment, 'he', 'he'))) |
+----------------------------------------------------+
| 158997209                                          |
+----------------------------------------------------+
Fetched 1 row(s) in 1.73s
[localhost:21000] > select sum(length(replace(l_comment, 'he', 'HE'))) from tpch.lineitem;
Query: select sum(length(replace(l_comment, 'he', 'HE'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:23:45 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=1541a32bc801a543:8efcb85300000000
+---------------------------------------------+
| sum(length(replace(l_comment, 'he', 'he'))) |
+---------------------------------------------+
| 158997209                                   |
+---------------------------------------------+
Fetched 1 row(s) in 1.53s
[localhost:21000] > select sum(length(regexp_replace(l_comment, 'comment', '//'))) from tpch.lineitem;
Query: select sum(length(regexp_replace(l_comment, 'comment', '//'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:24:22 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=5e453cd4807ee411:987abebe00000000
+---------------------------------------------------------+
| sum(length(regexp_replace(l_comment, 'comment', '//'))) |
+---------------------------------------------------------+
| 158997209                                               |
+---------------------------------------------------------+
Fetched 1 row(s) in 1.74s
[localhost:21000] > select sum(length(replace(l_comment, 'comment', '//'))) from tpch.lineitem;
Query: select sum(length(replace(l_comment, 'comment', '//'))) from tpch.lineitem
Query submitted at: 2017-02-03 19:24:30 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=854f49d5c5c3951f:271f7f2200000000
+--------------------------------------------------+
| sum(length(replace(l_comment, 'comment', '//'))) |
+--------------------------------------------------+
| 158997209                                        |
+--------------------------------------------------+
Fetched 1 row(s) in 1.33s
[localhost:21000] >

http://gerrit.cloudera.org:8080/#/c/5776/6/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 2619:   /* Since "IF", "TRUNCATE" are keywords, need to special case these functions */
> update comment
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1780a7d8fee6d0db9dad148217fb6eb10f773329
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes
