impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-1619: Support 64-bit allocations.
Date Fri, 24 Jun 2016 18:22:25 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-1619: Support 64-bit allocations.
......................................................................


Patch Set 7:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/2781/7//COMMIT_MSG
Commit Message:

PS7, Line 10: bug decompressor
> garbled
Done


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/delimited-text-parser.cc
File be/src/exec/delimited-text-parser.cc:

Line 123:           row_end_locations, field_locations, num_tuples, num_fields, next_column_start));
> did you run a quick text parsing benchmark to see what effect, if any, thes
Not really. Is there any recommendation on the benchmark to run ?


Line 161:         DCHECK(status.ok());
> why is that the case?
The first argument length is 0 so we are guaranteed to succeed.


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS7, Line 168: byte_size
> any reason to not just make byte_size() return int64?
That would affect more callers which may require updating various other classes (e.g. Tuple,
various scanners) to be consistent.  May not be worth it unless we plan to support 64-bit
tuple size. Please let me know what you think.


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS7, Line 62: 1GB only as StringValue and StringVal use 32-bit
> this doesn't quite explain it given that 32-bit can go up to one before 2GB
Done


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS7, Line 558: >
> how did this get reversed?
Looks like it's this way when it was first introduced in commit 406a3188.


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/collection-value-builder.h
File be/src/runtime/collection-value-builder.h:

PS7, Line 40: static_cast<int64_t>
> should we just make initial_tuple_capacity be an int64_t?
Done


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/mem-pool.h
File be/src/runtime/mem-pool.h:

Line 114:   uint8_t* EmptyAllocPtr() { return reinterpret_cast<uint8_t *>(&zero_length_region_);
}
> maybe make this static?  Seems more like a global constant rather than a pe
Done


Line 215:     if (UNLIKELY(size < 0)) return NULL;
> when would this happen legitimately? wondering if it should be a DCHECK ins
Converted to DCHECK.


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/runtime/string-buffer.h
File be/src/runtime/string-buffer.h:

PS7, Line 31: size_
> buffer_size_?
Done


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/util/bit-util.h
File be/src/util/bit-util.h:

PS7, Line 206: positive
> shouldn't this say "non-negative". i.e. 0 is okay, right?
Done


PS7, Line 208: Is32BitPositive
> if so, how about IsNonNegative32Bit()?  (Also switched the order since that
Done


Line 209:     return (value & ~((1ULL << 31) - 1)) == 0;
> A more standard and IMO readable (and just as fast) way is:
Done


http://gerrit.cloudera.org:8080/#/c/2781/7/be/src/util/decompress.cc
File be/src/util/decompress.cc:

Line 446:         return Status(TErrorCode::SNAPPY_DECOMPRESS_INVALID_COMPRESSED_LENGTH);
> why did the old code have uncompressed_total_len? was that just wrong?
The old code actually will not flag an error if there are some legitimate uncompressed blocks
(i.e. uncompressed_total_len > 0) and return the partial uncompressed block as the uncompressed
length. I am not sure if there is any benefit to such behavior so this change just simply
flag an error if there is anything wrong with the file.


http://gerrit.cloudera.org:8080/#/c/2781/7/infra/python/bootstrap_virtualenv.py
File infra/python/bootstrap_virtualenv.py:

Line 123:   impala_home = os.environ.get("IMPALA_HOME", "")
> is this variable needed?
Removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/2781
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ed28083d809a86d801a9c063a0aa32c50d32b20
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message