impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading
Date Sat, 21 Oct 2017 00:12:39 GMT
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
......................................................................


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@221
PS7, Line 221:     // are excluded because they are considered hidden from Impala's perspecitve
Shrink:

// Number of hidden files excluded from file metadata loading . More details at isValidDataFile().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@226
PS7, Line 226:     // Number of files skipped while computeing the block metadata information.
Shrink:

// Number of files skipped from file metadata loading because the the files have not changed
since the last load. More details at hasFileChanged().


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248
PS7, Line 248:     // If set to true, reloads the block metadata only when the files in this
path
reloads the file metadata (let's try to use the new "file metadata" terminology consistently)


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@249
PS7, Line 249:     // has changed since last load (more details in hasFileChanged()).
have changed


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@785
PS7, Line 785:    * much higher throughput of RPC calls for listStatus/listFiles. Based on
our
... RPC calls for listStatus/listFiles. For simplicity, the filesystem type is determined
based on the table's root path and not for each partition individually. Based on our experiments,
S3 showed ...


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@791
PS7, Line 791:    * This method is not optimized for tables with mixed partition types (partitions
mapped
I don'd think we need this much detail, how about folding a simple sentence into the existing
paragraph above, see suggestion above.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@794
PS7, Line 794:    * This may not work well with the mixed partition types and needs to fixed.
Don't think this is needed.


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@819
PS7, Line 819:     // For tables without partitions, we have no block metadata to load.
remove ","


http://gerrit.cloudera.org:8080/#/c/8235/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@841
PS7, Line 841:           LOG.error("Encountered an error loading block metadata for table:
" +
Could this flood the log when HDFS is under pressure or there is an intermittent Kerberos
issue? I'm thinking we should aggregate these into a single log message. Do you think having
every single path helps with supportability? My feeling is it's more important to note how
many tasks failed, but I'll defer to you if you think having the paths is useful.



-- 
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Comment-Date: Sat, 21 Oct 2017 00:12:39 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message