impala-reviews mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Date Fri, 14 Apr 2017 20:25:28 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................


Patch Set 4:

Perf results from running on the 10 node cluster:

For smaller queries that we were able to handle previously, there's a regression of roughly
7-9% in overall query running time (all times averaged over 3 runs):
240.13s with the patch vs. 224.58s previously for 200m rows inserted
472.97s with the patch vs. 433.05s previously for 400m rows inserted
That's unfortunate, but it can be improved in the future, e.g. by codegen-ing the partition function.
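
As a quick check on the numbers above, the relative slowdown can be worked out directly from the reported averages. This is only a small arithmetic sketch; the names `results` and `regression_pct` are made up for illustration and are not part of the patch or of Impala.

```python
# Average query times in seconds (3 runs each), copied from the results above.
# The dictionary layout and function name are illustrative assumptions.
results = {
    "200m rows": {"patched": 240.13, "baseline": 224.58},
    "400m rows": {"patched": 472.97, "baseline": 433.05},
}


def regression_pct(patched, baseline):
    """Relative slowdown of the patched run versus the baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


for label, t in results.items():
    pct = regression_pct(t["patched"], t["baseline"])
    print(f"{label}: {pct:.1f}% slower with the patch")
```

This gives about 6.9% for the 200m row insert and about 9.2% for the 400m row insert.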

But we can now handle significantly larger inserts - previously I was regularly seeing timeouts
above 400m rows, but with the patch I've tested inserts of up to 6b rows without any timeouts.

-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-HasComments: No
