impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4029: Reduce memory requirements for storing file metadata
Date Tue, 11 Apr 2017 04:58:26 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4029: Reduce memory requirements for storing file metadata
......................................................................


Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6406/3/common/fbs/CatalogObjects.fbs
File common/fbs/CatalogObjects.fbs:

Line 36:   offset: long = 0 (id: 0);
> Why a default value? Seems potentially dangerous.
Done


Line 40:   length: long = -1 (id: 1);
> Seems redundant, why keep it?
You mean why not compute it on the fly by iterating on the file blocks?


Line 65:   compression: FbCompression (id: 3);
> Will FlatBuffers add padding to align members? Ideally, we'd optimize for s
Hm, not sure I understand what the ask is here? Should I try to merge this with some other
field or something else?


http://gerrit.cloudera.org:8080/#/c/6406/3/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 218:   1: required binary file_desc_data
> Leave a todo to put this in a KRPC sidecar :) No need to pay serialization 
Done


Line 218:   1: required binary file_desc_data
> Nice
Done


Line 296:   10: optional list<string> file_name_prefixes
> Is this required for the move to flat buffers? I think we should consider a
No, this is definitely not needed now. Actually, I feel the current implementation is too
flaky and not easy to explain. I'll remove it for now and leave a TODO. We can revisit the
file name compression later.


http://gerrit.cloudera.org:8080/#/c/6406/3/fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
File fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java:

Line 78:   public byte toFbCompression() {
> toFlatBuffer()?
Done


http://gerrit.cloudera.org:8080/#/c/6406/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

Line 126:         locations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
> I think it would be better for now to keep all the code that fetches inform
Done


Line 227:             loc.getNames().length);
> Another case where we might be calling out to the NN.
I don't think any of these statements is making a call to NN. Which one are you talking about?


Line 321:     private static int REPLICA_HOST_CACHE_MASK = 0x8000;
> Less code and more readable to have one var with (1 << 15). The places that
Done


http://gerrit.cloudera.org:8080/#/c/6406/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 189:   // List of file name prefixes.
> Sorted list of file name prefixes?
This was removed.


Line 309:         Pair<Integer, String> compressedFileName =
> How predictable is the compression behavior? Do we iterate over the files i
Removed.


Line 340:    * the suffix is equal to 'fileName'.
> Should mention that this function expects the fileNames to be sorted.
Removed.


Line 346:     String commonPrefix = Strings.commonPrefix(prevFileName, fileName);
> I'm wondering if it's better to first check the last common prefix instead 
Removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/6406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message