impala-reviews mailing list archives

From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4840: Fix REFRESH performance regression.
Date Wed, 15 Feb 2017 20:07:41 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4840: Fix REFRESH performance regression.
......................................................................


Patch Set 2:

(4 comments)

Getting close.

http://gerrit.cloudera.org:8080/#/c/6009/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 787:    * for modified files on HDFS. This method uses a FileSystem.listStatus() call on the
Suggest something like:

This method is optimized for the case where the files in the partition have not changed dramatically.
It first uses FileSystem.listStatus() ... 

Then you can mention the performance difference between the two functions.
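
To make the suggested wording concrete, here is a minimal, self-contained sketch of the pattern the comment describes: do one cheap listing first, and only do the expensive block-location lookup for files that are new or modified. This is not the code from the patch. `RefreshSketch`, `FileMeta`, `FileDescriptor`, and `BlockLocationFetcher` are hypothetical names; the listing stands in for the result of FileSystem.listStatus() and the fetcher stands in for FileSystem.getFileBlockLocations(). Detecting a change by comparing length and modification time is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RefreshSketch {
    /** Cheap per-file metadata, standing in for one entry of a directory listing. */
    static final class FileMeta {
        final String path;
        final long length;
        final long modificationTime;

        FileMeta(String path, long length, long modificationTime) {
            this.path = path;
            this.length = length;
            this.modificationTime = modificationTime;
        }
    }

    /** Cached entry: the metadata plus the (expensive to obtain) block hosts. */
    static final class FileDescriptor {
        final FileMeta meta;
        final List<String> blockHosts;

        FileDescriptor(FileMeta meta, List<String> blockHosts) {
            this.meta = meta;
            this.blockHosts = blockHosts;
        }
    }

    /** Hypothetical stand-in for the expensive per-file block-location call. */
    interface BlockLocationFetcher {
        List<String> fetch(FileMeta file);
    }

    /**
     * Rebuilds the descriptor map from a fresh listing. Unchanged files reuse
     * their cached descriptor; new or modified files trigger a fetch. Files
     * missing from the listing are dropped from the result.
     */
    static Map<String, FileDescriptor> refresh(
            Map<String, FileDescriptor> cached,
            List<FileMeta> listing,
            BlockLocationFetcher fetcher) {
        Map<String, FileDescriptor> result = new HashMap<>();
        for (FileMeta current : listing) {
            FileDescriptor old = cached.get(current.path);
            // Assumption: same length and modification time means unchanged.
            boolean unchanged = old != null
                    && old.meta.length == current.length
                    && old.meta.modificationTime == current.modificationTime;
            if (unchanged) {
                result.put(current.path, old);
            } else {
                result.put(current.path,
                        new FileDescriptor(current, fetcher.fetch(current)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, FileDescriptor> cached = new HashMap<>();
        cached.put("a", new FileDescriptor(new FileMeta("a", 10, 1), List.of("host1")));
        cached.put("b", new FileDescriptor(new FileMeta("b", 20, 1), List.of("host2")));

        List<FileMeta> listing = new ArrayList<>();
        listing.add(new FileMeta("a", 10, 1)); // unchanged
        listing.add(new FileMeta("b", 20, 2)); // modified
        listing.add(new FileMeta("c", 30, 2)); // new

        int[] lookups = {0};
        refresh(cached, listing, f -> {
            lookups[0]++;
            return List.of("host3");
        });
        System.out.println("expensive lookups: " + lookups[0]); // prints "expensive lookups: 2"
    }
}
```

With this shape, the cost of a refresh scales with the number of changed files rather than the total file count, which is the case the suggested comment says the method is optimized for.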


Line 789:    * block locations using FileSystem.getFileBlockLocations() method. The initial table
Nit: "using the FileSystem.getFileBlockLocations() method" (add "the").


Line 793:    * (up to ~40x slower in some cases) and hence it is implemented this way to optimize
Suggest moving this wording to the top, as noted in the comment on line 787.


Line 836:       if (unknownDiskIdCount > 0 && LOG.isWarnEnabled()) {
Remove the disk ID warning, as you suggested.


-- 
To view, visit http://gerrit.cloudera.org:8080/6009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I859b9fe93563ba886d0b5db6db42a14c88caada8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-HasComments: Yes
