impala-dev mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3780: avoid many small reads past end of block
Date Thu, 30 Jun 2016 05:46:23 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3780: avoid many small reads past end of block
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3518/3/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

Line 162:     read_past_buffer_size = ::min(read_past_buffer_size, max_buffer_size);
> this is fine, but why is this min() needed now whereas it wasn't before?
I changed the contract with the callback slightly. Previously, each callback was responsible
for returning a size <= 8MB; otherwise it would hit a DCHECK in Read(). It makes more sense
to enforce the limit here, right before allocating the scan range.
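
A minimal sketch of that contract change, assuming a hypothetical callback type and helper
name (ClampReadPastSize and kMaxBufferSize are illustrative and not taken from the patch):
the callback may return any size, and the caller clamps it once before the allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>

// Assumed limit for illustration; the review only says callbacks had to stay <= 8MB.
constexpr int64_t kMaxBufferSize = 8LL * 1024 * 1024;

// Hypothetical helper: asks the callback how many bytes to read past the end of
// the scan range, then clamps the answer so that no individual callback has to
// enforce the limit itself.
int64_t ClampReadPastSize(const std::function<int64_t(int64_t)>& read_past_size_cb,
                          int64_t file_offset,
                          int64_t max_buffer_size = kMaxBufferSize) {
  assert(max_buffer_size > 0);
  int64_t read_past_buffer_size = read_past_size_cb(file_offset);
  return std::min(read_past_buffer_size, max_buffer_size);
}
```

With the clamp in one place, a callback that asks for 16MB simply gets 8MB instead of
tripping an assertion further down.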


Line 262:       RETURN_IF_ERROR(boundary_buffer_->EnsureCapacity(requested_len));
> were there cases where requested_len is much larger than the number of byte
I don't think so. As far as I've seen, most scanners only do large reads when they know the
expected size, e.g. a Parquet column or a compressed block. This will actually save memory
for large reads: each time we double the buffer, we can't immediately free the previous
buffer, so the intermediate buffers add up.
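
To make the memory argument concrete, here is a rough sketch that counts bytes allocated
under each strategy. It assumes old buffers are never returned while the buffer grows, and
the function names are hypothetical rather than anything in string-buffer.h.

```cpp
#include <cassert>
#include <cstdint>

// Bytes allocated when growing by doubling from initial_capacity until
// requested_len fits, assuming none of the earlier buffers can be freed.
int64_t BytesAllocatedByDoubling(int64_t initial_capacity, int64_t requested_len) {
  assert(initial_capacity > 0);
  int64_t capacity = initial_capacity;
  int64_t total = capacity;
  while (capacity < requested_len) {
    capacity *= 2;
    total += capacity;  // the previous buffer is still held
  }
  return total;
}

// Bytes allocated when the final size is known and reserved in one step.
int64_t BytesAllocatedUpFront(int64_t requested_len) {
  return requested_len;
}
```

Under these assumptions, growing from 1KB to an 8MB read allocates 16,776,192 bytes in
total, roughly twice the 8,388,608 bytes needed when the capacity is reserved once.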


http://gerrit.cloudera.org:8080/#/c/3518/3/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

PS3, Line 103: .
> ... (see GetNextBuffer()).
Done


http://gerrit.cloudera.org:8080/#/c/3518/3/be/src/runtime/string-buffer.h
File be/src/runtime/string-buffer.h:

PS3, Line 99: int
> this may conflict with Michael's 64-bit change (changed len's to 64-bits).
I'll rebase and then check this.


-- 
To view, visit http://gerrit.cloudera.org:8080/3518
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id90c5dea44f07dba5dd465cf325fbff28be34137
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
