impala-reviews mailing list archives

From "Bharath Vissapragada (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5431: Remove redundant path exists checks during table load
Date Mon, 26 Jun 2017 23:26:03 GMT
Hello Dimitris Tsirogiannis, Alex Behm,

I'd like you to reexamine a change.  Please visit

to look at the new patch set (#7).

Change subject: IMPALA-5431: Remove redundant path exists checks during table load

IMPALA-5431: Remove redundant path exists checks during table load

There are multiple places that do an exists() check on a path and then
perform some subsequent action on it. This pattern results in two
RPCs to the NameNode (NN): one for the exists() check and one for the
subsequent action. We can avoid the exists() check in these cases since
most HDFS methods on paths throw a FileNotFoundException if the path does
not exist. This saves an RPC to the NN and improves metadata loading time.
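
The pattern described above can be sketched roughly as follows. This is an
illustration only, not the code from this patch: CountingFs is a hypothetical
in-memory stand-in for a remote filesystem (it is not the Hadoop FileSystem
API), and the helper names listWithExistsCheck and listWithoutExistsCheck are
made up for the example. Each call on CountingFs is counted as one RPC.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExistsCheckSketch {
  /** Hypothetical stand-in for a remote filesystem; each call counts as one RPC. */
  static class CountingFs {
    private final Map<String, List<String>> dirs = new HashMap<>();
    int rpcCount = 0;

    void addDir(String path, String... files) {
      dirs.put(path, Arrays.asList(files));
    }

    boolean exists(String path) {
      rpcCount++;
      return dirs.containsKey(path);
    }

    // Like most HDFS path methods, this throws if the path is missing.
    List<String> listStatus(String path) throws FileNotFoundException {
      rpcCount++;
      List<String> files = dirs.get(path);
      if (files == null) {
        throw new FileNotFoundException("Path does not exist: " + path);
      }
      return files;
    }
  }

  /** Before: exists() followed by the real action, two RPCs for an existing path. */
  static List<String> listWithExistsCheck(CountingFs fs, String path)
      throws FileNotFoundException {
    if (!fs.exists(path)) return Collections.emptyList();
    return fs.listStatus(path);
  }

  /** After: perform the action directly and handle the exception, one RPC. */
  static List<String> listWithoutExistsCheck(CountingFs fs, String path) {
    try {
      return fs.listStatus(path);
    } catch (FileNotFoundException e) {
      // A missing path is treated the same way the exists() check treated it.
      return Collections.emptyList();
    }
  }

  public static void main(String[] args) throws Exception {
    CountingFs fs = new CountingFs();
    fs.addDir("/tbl/p=1", "a.parq", "b.parq");

    listWithExistsCheck(fs, "/tbl/p=1");
    System.out.println("RPCs with exists() check: " + fs.rpcCount);

    fs.rpcCount = 0;
    listWithoutExistsCheck(fs, "/tbl/p=1");
    System.out.println("RPCs without exists() check: " + fs.rpcCount);
  }
}
```

In this sketch both helpers return the same result for existing and missing
paths; the only difference is the number of calls made when the path exists,
which is where the saving on large partitioned tables would come from.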

Testing: Enough tests already cover this code path. This patch
passed core and exhaustive tests.

The metadata benchmark shows a decent performance improvement, for example:

80-PARTITIONS-250K-FILES-S3-12-DROP -23.69%
80-PARTITIONS-250K-FILES-S3-01-DROP -95.33%
80-PARTITIONS-250K-FILES-01-DROP -95.93%

Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
M fe/src/main/java/org/apache/impala/catalog/
M fe/src/main/java/org/apache/impala/common/
2 files changed, 64 insertions(+), 39 deletions(-)

  git pull ssh:// refs/changes/95/7095/7

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id10ecf64ea2eda2d0f9299c0aa371933eca22281
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
