impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4029: Reduce memory requirements for storing file metadata
Date Tue, 11 Apr 2017 04:58:26 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4029: Reduce memory requirements for storing file metadata

Patch Set 3:

File common/fbs/CatalogObjects.fbs:

Line 36:   offset: long = 0 (id: 0);
> Why a default value? Seems potentially dangerous.

Line 40:   length: long = -1 (id: 1);
> Seems redundant, why keep it?
You mean why not compute it on the fly by iterating on the file blocks?

Line 65:   compression: FbCompression (id: 3);
> Will FlatBuffers add padding to align members? Ideally, we'd optimize for s
Hm, not sure I understand what the ask is here? Should I try to merge this with some other
field or something else?
File common/thrift/CatalogObjects.thrift:

Line 218:   1: required binary file_desc_data
> Leave a todo to put this in a KRPC sidecar :) No need to pay serialization 

Line 218:   1: required binary file_desc_data
> Nice

Line 296:   10: optional list<string> file_name_prefixes
> Is this required for the move to flat buffers? I think we should consider a
No, this is definitely not needed now. Actually, I feel the current implementation is too
flaky and not easy to explain. I'll remove it for now and leave a TODO. We can revisit the
file name compression later.
File fe/src/main/java/org/apache/impala/catalog/

Line 78:   public byte toFbCompression() {
> toFlatBuffer()?
File fe/src/main/java/org/apache/impala/catalog/

Line 126:         locations = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
> I think it would be better for now to keep all the code that fetches inform

Line 227:             loc.getNames().length);
> Another case where we might be calling out to the NN.
I don't think any of these statements is making a call to NN. Which one are you talking about?

Line 321:     private static int REPLICA_HOST_CACHE_MASK = 0x8000;
> Less code and more readable to have one var with (1 << 15). The places that
File fe/src/main/java/org/apache/impala/catalog/

Line 189:   // List of file name prefixes.
> Sorted list of file name prefixes?
This was removed.

Line 309:         Pair<Integer, String> compressedFileName =
> How predictable is the compression behavior? Do we iterate over the files i

Line 340:    * the suffix is equal to 'fileName'.
> Should mention that this function expects the fileNames to be sorted.

Line 346:     String commonPrefix = Strings.commonPrefix(prevFileName, fileName);
> I'm wondering if it's better to first check the last common prefix instead 

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I483d3cadc9d459f71a310c35a130d073597b0983
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message