hive-issues mailing list archives

From "Zoltan Haindrich (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()
Date Fri, 28 Feb 2020 08:21:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047308#comment-17047308 ]

Zoltan Haindrich commented on HIVE-22929:
-----------------------------------------

[~kkasa] Do we need the power of regexps to address the original issue? I think it was something
like:
{code}
str.replaceAll("``","`")
{code}
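
For context on the "throwaway" part: String.replaceAll(regex, replacement) compiles its regex argument on every call. The sketch below shows two ways to avoid that recompilation, a precompiled Pattern and the literal String.replace. The class and method names are hypothetical and not taken from the Hive codebase, and which variant is faster would still need to be measured.

```java
import java.util.regex.Pattern;

// Hypothetical holder class, for illustration only.
final class QuotedIdentifierUnescape {
  // Compiled once and reused; Pattern instances are immutable and thread-safe.
  private static final Pattern DOUBLED_BACKTICK = Pattern.compile("``");

  // Same result as s.replaceAll("``", "`"), without recompiling the regex per call.
  static String withPrecompiledPattern(String s) {
    return DOUBLED_BACKTICK.matcher(s).replaceAll("`");
  }

  // String.replace treats both arguments as literal text (no regex syntax).
  static String withLiteralReplace(String s) {
    return s.replace("``", "`");
  }

  public static void main(String[] args) {
    String input = "a``b";
    System.out.println(withPrecompiledPattern(input)); // prints a`b
    System.out.println(withLiteralReplace(input));     // prints a`b
  }
}
```

Both variants keep the left-to-right, non-overlapping matching of the original call, so three backticks in a row become two.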
I think a specially tailored utility function would probably perform even better. Could you
see how this performs:
{code}
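  // Collapses every doubled occurrence of c into a single c,
  // e.g. fx("a``b", '`') returns "a`b"; a lone c is kept as-is.
  public String fx(String s, char c) {
    StringBuilder sb = new StringBuilder(s.length());
    // true when the previously appended char was c and may start a doubled pair
    boolean pending = false;
    for (int i = 0; i < s.length(); i++) {
      char curr = s.charAt(i);
      if (pending && curr == c) {
        // second char of a doubled pair: drop it
        pending = false;
        continue;
      }
      pending = (curr == c);
      sb.append(curr);
    }
    return sb.toString();
  }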
{code}
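
Before benchmarking, it may be worth checking that a hand-written loop agrees with replaceAll on edge cases such as a lone backtick or an odd run of backticks. A minimal sketch of such a check follows; the class and helper names are hypothetical, and the helper is a self-contained stand-in for the tailored function rather than Hive code.

```java
// Hypothetical sanity check, for illustration only.
final class CollapseDoubledCheck {
  // Stand-in helper: collapses each doubled occurrence of c into a single c.
  static String collapseDoubled(String s, char c) {
    StringBuilder sb = new StringBuilder(s.length());
    boolean pending = false; // previous kept char was c
    for (int i = 0; i < s.length(); i++) {
      char curr = s.charAt(i);
      if (pending && curr == c) {
        pending = false; // drop the second character of the pair
        continue;
      }
      pending = (curr == c);
      sb.append(curr);
    }
    return sb.toString();
  }

  // Returns true when the loop and the regex version agree on every sample.
  static boolean agreesWithReplaceAll(String... samples) {
    for (String s : samples) {
      if (!collapseDoubled(s, '`').equals(s.replaceAll("``", "`"))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(
        agreesWithReplaceAll("", "abc", "a``b", "``", "a`b", "```", "````x``"));
  }
}
```

The sample list deliberately includes inputs the lexer rule would never produce (a lone backtick inside the identifier), so the check covers the general contract of replaceAll and not only the lexer's use of it.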

Note: for performance measurements you can write tests under itests/hive-jmh; there are
a few there already.

I think another approach could be: since we have a lexer here (somewhere), we might be
able to convince it to process these escapes for us. I'm not sure at what cost it could
do that.

I would recommend not changing this all over the place, since it might not affect general performance;
for example, the performance gain at places like a ddltask is irrelevant. I think it's best
to focus on the performance issue at hand; the impact of the same issue is usually negligible
at other places.

[~gopalv] Could you please provide a sample query for this? It might be interesting to take
a look at it, in case it heats up something like a "String.replaceAll" call.


> Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-22929
>                 URL: https://issues.apache.org/jira/browse/HIVE-22929
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Krisztian Kasa
>            Priority: Major
>         Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
>     '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, getText().length() - 1).replaceAll("``", "`")); }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
