impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Knupp (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-9362: Upgrade sqlparse 0.1.19 -> 0.3.1
Date Wed, 08 Apr 2020 22:13:24 GMT
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/15642 )

Change subject: IMPALA-9362: Upgrade sqlparse 0.1.19 -> 0.3.1
......................................................................


Patch Set 6:

> Patch Set 6:
> 
> I think we can probably live with a 10-15% regression for extreme cases like that, it
would be a bit heroic to try to fix it in sqlparse.

I was more wondering if there were things that we could do in our code. It does seem like
sqlparse itself is slower, but just doing something like this before stripping leading comments:

    # Only call sqlparse if indicated
    if sql.lstrip() and sql.lstrip()[0] in ('-', '/'):
      stack = sqlparse.engine.FilterStack()
      strip_leading_comment_filter = StripLeadingCommentFilter()
      stack.stmtprocess.append(strip_leading_comment_filter)
      stack.postprocess.append(sqlparse.filters.SerializerUnicode())
      stripped_line = ''.join(stack.run(sql, 'utf-8'))
      return strip_leading_comment_filter.comment, stripped_line
    else:
      return None, sql.decode('utf-8')

...all the tests still pass, and the time improves:

# sqlparse-0.3.1
Time to parse large sql: 0.771239995956
Time to parse large sql: 0.756343841553
Time to parse large sql: 0.755437135696
Time to parse large sql: 0.760624885559
Time to parse large sql: 0.764199018478
Time to parse large sql: 0.753574132919
Time to parse large sql: 0.75120306015
Time to parse large sql: 0.753540039062
Time to parse large sql: 0.764436006546

I'm happy to do that if you think it's worth it, or I'm happy ship as is if you're fine with
the current patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/15642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77a1fd5ae311634a18ee04b8c389d8a3f3a6e001
Gerrit-Change-Number: 15642
Gerrit-PatchSet: 6
Gerrit-Owner: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stakiar@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Apr 2020 22:13:24 +0000
Gerrit-HasComments: No

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message