impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bharath Vissapragada (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4029: Reduce memory requirements for storing file metadata
Date Thu, 23 Mar 2017 21:21:21 GMT
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4029: Reduce memory requirements for storing file metadata
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6406/2/common/fbs/CMakeLists.txt
File common/fbs/CMakeLists.txt:

PS2, Line 64: message(${JAVA_FE_ARGS})
Looks like debug printing. Remove? (below too)


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

PS2, Line 67: storageIdtoInt
(Unrelated to your change). Looks obsolete.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 119:       FlatBufferBuilder fbb = new FlatBufferBuilder(1);
Preconditions.checkState(isBlockMdSupported(fs))?


PS2, Line 124: } else {
             :         locations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
             :       }
Do we really need this? Per my understanding, we only fetch LocatedFileStatus'es going forward
as we don't want to do another RPC to getFileBlockLocations(). In another words, is there
some caller that requires this?


Line 141:         ListMap<TNetworkAddress> hostIndex) {
Preconditions.checkState(!isBlockMdSupported(fs))


PS2, Line 425: fbFileBlock_.replicaHostIdxs(replicaIdx);
- replicaHostIdxs is [ushort] as per fbs definition in which case idx should be a short? That
REPLICA_HOST_CACHE_MASK could be made made short and avoid typecasting in L361. 

- Also that way we could use only REPLICA_HOST_CACHE_MASK and use ~REPLICA_HOST_CACHE_MASK
for unmasking instead of a separate REPLICA_HOST_IDX_MASK. Feel free to ignore this if you
think that affects readability.


Line 463:   public static class FileDescStats {
For supportability purposes, I'm wondering if we should track the remote_network_addresses
and present a rolled up version of stats so that we can know which DNs to look at when we
see -ve disk IDs.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS2, Line 344: =
nit: spacing


Line 365:   private static String findCommonPrefix(String a, String b) {
Use Strings.commonPrefix() from guava?


Line 1933:       Collections.sort(orderedFds, new Comparator<FileDescriptor>() {
Can we just make FileDescriptor implement Comparator<FileDescriptor> and add a compare()
method there. I think that'd be cleaner.


http://gerrit.cloudera.org:8080/#/c/6406/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

Line 93:   protected boolean storedInImpaladCatalogCache_ = false;
What is the motivation to track this? A quick look at the places it is used, it seems like
they are only called when it actually true (for example resetPartitions()). Are there callers
when its false too? Like tests?


-- 
To view, visit http://gerrit.cloudera.org:8080/6406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message