impala-reviews mailing list archives

From "Impala Public Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5431: Remove redundant path exists checks during table load
Date Tue, 27 Jun 2017 03:40:53 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5431: Remove redundant path exists checks during table load
......................................................................


IMPALA-5431: Remove redundant path exists checks during table load

Several places in the table load path call exists() on a path and then
perform some subsequent action on it. This pattern costs two RPCs to
the NameNode (NN): one for the exists() check and one for the
subsequent action. The exists() check can be dropped in these cases
because most HDFS methods on paths throw a FileNotFoundException if the
path does not exist. This saves an RPC to the NN and improves metadata
loading time.
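
The pattern described above can be illustrated with a minimal sketch. It is
not the code from this patch: CountingFs is a hypothetical in-memory stand-in
for a file system client that counts one simulated RPC per call, and the
method names listWithExistsCheck and listWithoutExistsCheck are made up for
this example. The only behavior taken from the description is that the
listing call throws FileNotFoundException for a missing path.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExistsCheckSketch {
  /** Hypothetical in-memory file system; every call counts as one simulated RPC. */
  static class CountingFs {
    private final Map<String, List<String>> dirs = new HashMap<>();
    int rpcCount = 0;

    void addDir(String path, List<String> files) {
      dirs.put(path, new ArrayList<>(files));
    }

    boolean exists(String path) {
      rpcCount++;
      return dirs.containsKey(path);
    }

    List<String> listStatus(String path) throws FileNotFoundException {
      rpcCount++;
      List<String> files = dirs.get(path);
      if (files == null) throw new FileNotFoundException(path);
      return files;
    }
  }

  /** Old pattern: exists() followed by the real action, two calls for an existing path. */
  static List<String> listWithExistsCheck(CountingFs fs, String path) {
    if (!fs.exists(path)) return Collections.emptyList();
    try {
      return fs.listStatus(path);
    } catch (FileNotFoundException e) {
      // The path can still disappear between the two calls.
      return Collections.emptyList();
    }
  }

  /** New pattern: perform the action directly and treat the exception as "missing". */
  static List<String> listWithoutExistsCheck(CountingFs fs, String path) {
    try {
      return fs.listStatus(path);
    } catch (FileNotFoundException e) {
      return Collections.emptyList();
    }
  }

  public static void main(String[] args) {
    CountingFs fs = new CountingFs();
    fs.addDir("/warehouse/t/p=1", Arrays.asList("a.parq", "b.parq"));

    listWithExistsCheck(fs, "/warehouse/t/p=1");
    int withCheck = fs.rpcCount;

    fs.rpcCount = 0;
    listWithoutExistsCheck(fs, "/warehouse/t/p=1");
    int withoutCheck = fs.rpcCount;

    System.out.println("with exists(): " + withCheck + ", without: " + withoutCheck);
  }
}
```

For a path that exists, the first variant makes two simulated calls and the
second makes one. Both return an empty list for a missing path, so callers
see the same result either way.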

Testing: Existing tests already cover this code path. This patch
passed core and exhaustive tests.

The metadata benchmark shows a decent improvement in perf numbers, for example:

100K-PARTITIONS-1M-FILES-CUSTOM-05-QUERY-AFTER-INV -20.51%
80-PARTITIONS-250K-FILES-S3-03-RECOVER -20.58%
80-PARTITIONS-250K-FILES-11-DROP-PARTITION -22.13%
80-PARTITIONS-250K-FILES-S3-08-ADD-PARTITION -22.38%
80-PARTITIONS-250K-FILES-S3-12-DROP -23.69%
100K-PARTITIONS-1M-FILES-CUSTOM-11-REFRESH-PARTITION -23.91%
100K-PARTITIONS-1M-FILES-CUSTOM-10-REFRESH-AFTER-ADD-PARTITION -26.04%
100K-PARTITIONS-1M-FILES-CUSTOM-07-REFRESH -26.38%
80-PARTITIONS-250K-FILES-S3-02-CREATE -36.47%
100K-PARTITIONS-1M-FILES-CUSTOM-12-QUERY-PARTITIONS -58.72%
80-PARTITIONS-250K-FILES-S3-01-DROP -95.33%
80-PARTITIONS-250K-FILES-01-DROP -95.93%

Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
Reviewed-on: http://gerrit.cloudera.org:8080/7095
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
2 files changed, 64 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Bharath Vissapragada: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
