impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Date Tue, 07 Mar 2017 00:39:51 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Patch Set 8:

File be/src/runtime/

PS8, Line 108: if
iff, just to be super clear.

Line 116:   }
fine with me to keep, but isn't this the same as line 109?

Line 141:     ss << &*read_page_;
will it be possible to identify which page in the list printed below this is?

Line 166:   DCHECK(read_page_ == pages_.end());

PS8, Line 219: If the stream is pinned, the previous write page will remain pinned
             :   // with pin count 1 as a read page. If a read/write stream is pinned, the
             :   // page has pin count 2, so we need to unpin once
these two sentences seem to both be talking about pinned streams, but i'm not sure what they
are trying to conclude. i also don't understand this if-stmt that follows.

Why isn't it simply:

if (write_page_ != nullptr) UnpinPage(write_page_); ?

i.e. we need to unpin *for write* the page regardless of the state of the read-iterator, no?
 i.e. can't we separate out reasoning of the read and write iterators?

PS8, Line 232: read_page_ != pages_.end()
what does this condition mean? why isn't it just: if the stream is pinned for reading, need
to take a pin count for both read and write, otherwise need a pin count for only write?

Line 241:     RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, &new_page.handle));
does this purposely not go through PinPage()?

PS8, Line 250: (double_pin ? 2 : 1)
i guess because of this. maybe move this to be after the CreatePage() call and just add 1
in there and then let PinPage() do its job?

Line 348:     *pinned = true;
why was this needed?

PS8, Line 356: We need to double-pin the current write page if we also have a read iterator.
why? or maybe you mean just that we need to take a pin-count for reading even if it's already
pinned for write? if that's the case, I think we should just talk about "read pinning" and
"write pinning" separately (and this routine shouldn't have to muck with write pinning, right?)

PS8, Line 358: write_page_ == &page && read_page_ == pages_.end()
why? i thought we now do take a separate pin count for read and for write?

Line 387:       if (pinned_) UnpinPage(&page);
do we allow you to call UnpinStream() on an already unpinned stream? why?

PS8, Line 388: continue
i think this would be clearer as:

if (is_write_page || is_read_page) {

} else {


rather than the "continue" flow.
File be/src/runtime/buffered-tuple-stream-v2.h:

PS8, Line 68: .
maybe add: "for reading"

PS8, Line 69: access
read access?

Line 71: /// unpinned and therefore do not use the client's reservation and can be spilled
to disk.
comments in this paragraph are meant to further clarify that pinned vs. unpinned stream really
only applies to the read-iterator.

PS8, Line 75: event

PS8, Line 319: bytes_pinned
i guess should be BytesPinned() since not a simple accessor

PS8, Line 385: including any pages already deleted in 'delete_on_read'
             :   /// mode.
that's not intuitive. is there a good reason why we do that rather than make it always be
the size of the stream? and if so, seems like it should be called highwater_byte_size or something.

PS8, Line 392: num_rows_returned_
rows_returned_? (or rename that member, which might be better).

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message