impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] Build a ConcatenatedStreams wrapper for ScannerContext::Stream
Date Thu, 27 Jul 2017 00:24:01 GMT
Lars Volker has posted comments on this change.

Change subject: Build a ConcatenatedStreams wrapper for ScannerContext::Stream
......................................................................


Patch Set 1:

(13 comments)

Looks good to me overall, please see my comments.

http://gerrit.cloudera.org:8080/#/c/7513/1//COMMIT_MSG
Commit Message:

Line 16: TODO: Currently, the ConcatenatedStreams don't support reads which span
It's best to document the TODO in the ConcatenatedStreams class itself, since that's where
users of the class would look for limitations.


http://gerrit.cloudera.org:8080/#/c/7513/1/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

PS1, Line 330:   if (!ValidateCurrentIndex()) return -1;
Is there a reason why you cannot DCHECK here, too?


PS1, Line 346: while (streams_[current_idx_]->eosr()) {
             :     if (!GetNextStream()) break;
             :   }
This block looks like a candidate for a method that increments current_idx_ until it the current
stream is not at eosr or it's at the last one. Maybe SkipStreamIfEosr()? There's probably
a better name. :)


http://gerrit.cloudera.org:8080/#/c/7513/1/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

PS1, Line 219: boundary_buffer_
This buffer looks like a good candidate in case you want to stitch together reads over stream
boundaries at some point, using the buffer of either the current or the next stream.


PS1, Line 273: Encapsulates a list of concatenated streams which can be
"Concatenates a list of streams by reading them sequentially." ?


PS1, Line 274: For columns where none of the pages can be
             :   /// skipped, the list contains only one stream.
This is a detail of how the class can be used. Try to describe only what it does, but mention
as a limitation / TODO that the caller must make sure to not read across page boundaries.


Line 287:     ConcatenatedStreams(const vector<Stream*> streams)
pass by reference


Line 290:     ~ConcatenatedStreams() {}
I don't think you need to explicitly define it as empty, since that's the default.


PS1, Line 295:   all the 
missing word?


PS1, Line 299: the first stream
I think you can just say "the current stream", here and elsewhere.


PS1, Line 300: every stream
Shouldn't it say "the last stream" since you're concatenating them in order?


PS1, Line 313: into
from?


Line 317:     void AddStream(Stream* stream) { streams_.push_back(stream); }
If you already know all the streams when creating the class, then you could only use the ConcatenatedStreams(const
vector<Stream*>&) ctor. That would remove the possible states that the class is
in (can read more, eosr).


-- 
To view, visit http://gerrit.cloudera.org:8080/7513
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I026f7a005ba4d6fc412ca0884f4fa0459da9c885
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar <pooja.nilangekar@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message