impala-reviews mailing list archives

From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3742: Partitions and sort INSERTs for Kudu tables
Date Mon, 17 Apr 2017 21:22:03 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3742: Partitions and sort INSERTs for Kudu tables
......................................................................


Patch Set 4:

 > > > Perf results from running on the 10 node cluster:
 > > >
 > > > For smaller queries that we were able to handle previously, there's
 > > > a regression of about 10% in overall query running time (all
 > > > averaged over 3 runs):
 > > > 240.13s with the patch vs. 224.58s previously for 200m rows inserted
 > > > 472.97s with the patch vs. 433.05s previously for 400m rows inserted
 > > > That's unfortunate, but can be improved in the future, e.g. by
 > > > codegen-ing the partition function.
 > > >
 > > > But, we can now handle significantly larger inserts - I was seeing
 > > > timeouts regularly at > 400m rows previously, but with the patch
 > > > I've tested up to 6b row inserts without any timeouts.
 > >
 > > Those are good results, and the perf regression is fairly
 > > negligible, so let's not worry about that for now.
 > 
 > Currently KuduPartitionExpr::GetIntVal is very expensive, although
 > it is not currently the bottleneck we should.
 > Was the soft memory limit reached during the tests?

While I think we should look at what we can do to improve the perf, I don't think this should
prevent us from getting this change in. The most important thing is that non-trivial DML statements
actually complete without timing out. Nobody will touch this otherwise. We can work on perf
after this change, but the issue is probably the lack of codegen in the Kudu client itself, which
will be really hard to address. There is probably much lower-hanging fruit for reclaiming the
roughly 10% regression.
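
For reference, the relative slowdown implied by the timings quoted above can be worked out
directly. This is only a sketch of the arithmetic: the dictionary keys and the function name
regression_pct are illustrative and do not come from the patch or the benchmark scripts. The
numbers come out a little under the "about 10%" figure quoted earlier in the thread.

```python
# Timings quoted in this thread, in seconds, each averaged over 3 runs:
# (with the patch, before the patch).
timings = {
    "200m rows": (240.13, 224.58),
    "400m rows": (472.97, 433.05),
}


def regression_pct(with_patch, before):
    """Relative slowdown of the patched run, as a percentage of the old time."""
    return (with_patch - before) / before * 100.0


results = {label: regression_pct(*pair) for label, pair in timings.items()}

for label, pct in results.items():
    print(f"{label}: {pct:.1f}% slower with the patch")
# 200m rows: 6.9% slower with the patch
# 400m rows: 9.2% slower with the patch
```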

-- 
To view, visit http://gerrit.cloudera.org:8080/6559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I84ce0032a1b10958fdf31faef225372c5c38fdc4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-HasComments: No
