impala-reviews mailing list archives

From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.
Date Sun, 15 Jan 2017 00:25:58 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(2 comments)

I'm not set on implementing the logging this way, but shipping the current code as is will
be a supportability nightmare. Bharath and I have felt the impact of the lack of log messages
while debugging a problem in an internal cluster.

I'm happy to implement these changes in a different way if you have a suggestion.

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS3, Line 241: BlockMetadataLoadStats
> I don't see any block-metadata-specific stats, so I am wondering if the nam
The file counts are specific to the block metadata loading. For full loads of partitioned
tables, the common case is that we recursively iterate over *all* files in the table's root
directory and then map the files to partitions based on their containing directory. This
means that there could be files in the table's root dir that don't belong to any partition.
So the number of files scanned and the number of files actually belonging to the table could
be different, and a large discrepancy could explain loading perf regressions (if any).
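
To make the distinction concrete, here is a rough sketch of the two counters. It is not the
code in the patch: the class name LoadStatsSketch, the field scannedFileCount, and the helper
methods are made up for illustration, and plain strings stand in for HDFS paths. Only
BlockMetadataLoadStats and tableFileCount are names from the change.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LoadStatsSketch {
  // Illustrative stand-in for the BlockMetadataLoadStats class under review.
  static class BlockMetadataLoadStats {
    // Number of files visited while listing the table's root directory.
    // (Assumed field name, not taken from the patch.)
    public long scannedFileCount = 0;
    // Number of files that were found to belong to the table being loaded.
    public long tableFileCount = 0;
  }

  // Returns the containing directory of a file path ("" if there is none).
  static String parentDir(String filePath) {
    int idx = filePath.lastIndexOf('/');
    return idx < 0 ? "" : filePath.substring(0, idx);
  }

  // Counts every file found under the root dir, and separately those whose
  // containing directory is a known partition directory.
  static BlockMetadataLoadStats countFiles(
      List<String> filesUnderRoot, Set<String> partitionDirs) {
    BlockMetadataLoadStats stats = new BlockMetadataLoadStats();
    for (String file : filesUnderRoot) {
      ++stats.scannedFileCount;
      if (partitionDirs.contains(parentDir(file))) ++stats.tableFileCount;
    }
    return stats;
  }

  public static void main(String[] args) {
    Set<String> partitionDirs =
        new HashSet<>(Arrays.asList("/tbl/p=1", "/tbl/p=2"));
    List<String> filesUnderRoot = Arrays.asList(
        "/tbl/p=1/a.parq",    // belongs to partition p=1
        "/tbl/p=2/b.parq",    // belongs to partition p=2
        "/tbl/_tmp/c.parq",   // directory is not a partition
        "/tbl/stray.txt");    // sits directly in the root dir
    BlockMetadataLoadStats stats = countFiles(filesUnderRoot, partitionDirs);
    // Prints "scanned=4 table=2"
    System.out.println("scanned=" + stats.scannedFileCount
        + " table=" + stats.tableFileCount);
  }
}
```

Logging both numbers side by side would make a gap like 4 vs. 2 above visible in the catalog
log without further digging.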

I'm ok with removing this change if you prefer.


PS3, Line 244: // Number of files that were found to belong to the table being loaded.
             :     public long tableFileCount = 0;
> Not sure I understand what this field represents? Can you please explain ?
See above comment, hope it explains it.

Let me know if you want to keep or remove this. If you prefer to keep it, I'll expand the comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-HasComments: Yes
