impala-reviews mailing list archives

From "Tianyi Wang (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5210: Count rows and collection items in parquet scanner separately
Date Thu, 24 Aug 2017 17:56:58 GMT
Tianyi Wang has posted comments on this change.

Change subject: IMPALA-5210: Count rows and collection items in parquet scanner separately

Patch Set 2:

Commit Message:

PS2, Line 9: addes
> typo

PS2, Line 26: couting
> typo

PS2, Line 27: every
> for every...
File be/src/exec/

Line 963:   int64_t num_rows_read = 0, num_coll_items_read = 0;
> We usually limit declarations to one per line per our style guide.

Line 964:   auto update_read_count = MakeScopeExitTrigger([&] {
> I find that this construct makes the code harder to reason about. Now someo
Done. If the RETURN_IF_ERROR or num_tuples_mismatch branch is taken, the value of the counter
will be different, but that is probably OK since the counter is only for profiling purposes.
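
To illustrate the trade-off discussed above, here is a minimal, self-contained sketch. It
does not use Impala's MakeScopeExitTrigger; ScopeExit, MakeScopeExit, Counters and both
Read* functions are hypothetical names made up for this example. It only shows why a
counter updated by a scope-exit hook and a counter updated explicitly at the end of the
function can disagree when an early return is taken.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a scope-exit helper (not Impala's implementation).
// Runs the stored callable when the guard goes out of scope.
template <typename F>
class ScopeExit {
 public:
  explicit ScopeExit(F f) : f_(std::move(f)), active_(true) {}
  ScopeExit(ScopeExit&& other) : f_(std::move(other.f_)), active_(other.active_) {
    other.active_ = false;  // The moved-from guard must not fire.
  }
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;
  ~ScopeExit() {
    if (active_) f_();
  }

 private:
  F f_;
  bool active_;
};

template <typename F>
ScopeExit<F> MakeScopeExit(F f) {
  return ScopeExit<F>(std::move(f));
}

// Hypothetical profiling counter.
struct Counters {
  int64_t rows_read = 0;
};

// Variant A: the counter is updated on every exit path, including the early
// return that simulates an error at row 'fail_at'.
bool ReadWithScopeExit(int rows_available, int fail_at, Counters* counters) {
  int64_t num_rows_read = 0;
  auto update_read_count =
      MakeScopeExit([&] { counters->rows_read += num_rows_read; });
  for (int i = 0; i < rows_available; ++i) {
    if (i == fail_at) return false;  // Hidden side effect: the guard still fires.
    ++num_rows_read;
  }
  return true;
}

// Variant B: the counter is updated explicitly at the end. The control flow is
// easier to follow, but the early return skips the update.
bool ReadWithExplicitUpdate(int rows_available, int fail_at, Counters* counters) {
  int64_t num_rows_read = 0;
  for (int i = 0; i < rows_available; ++i) {
    if (i == fail_at) return false;  // Counter is not updated on this path.
    ++num_rows_read;
  }
  counters->rows_read += num_rows_read;
  return true;
}
```

With a simulated failure at row 3, variant A records the rows read before the failure and
variant B records none. On the success path both variants record the same count.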

PS2, Line 978: continue_execution
> initialize
Reverted to the original code. But is it good practice to add a dead initialization?
File be/src/exec/hdfs-parquet-scanner.h:

Line 475:   RuntimeProfile::Counter* num_dict_filtered_row_groups_counter_;
> Can you do the counting using a member in the scanner instead of passing it
Done. Much better.
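
A rough sketch of the shape of that change, under stated assumptions: ToyScanner and
ToyColumnReader are hypothetical, heavily simplified classes, not the real
HdfsParquetScanner or column readers. The idea shown is that the reader holds a pointer to
its parent scanner and bumps a member counter there, so no counter has to be threaded
through the read methods as an extra argument.

```cpp
#include <cstdint>

// Hypothetical, simplified scanner that owns the counts as members.
class ToyScanner {
 public:
  void AddRowsRead(int64_t n) { rows_read_ += n; }
  void AddCollItemsRead(int64_t n) { coll_items_read_ += n; }
  int64_t rows_read() const { return rows_read_; }
  int64_t coll_items_read() const { return coll_items_read_; }

 private:
  // Rows and collection items are tracked separately.
  int64_t rows_read_ = 0;
  int64_t coll_items_read_ = 0;
};

// Hypothetical column reader for a collection column. It reports what it read
// to its parent scanner instead of returning counts through out-parameters.
class ToyColumnReader {
 public:
  explicit ToyColumnReader(ToyScanner* parent) : parent_(parent) {}

  // Pretends to read one row whose collection holds 'num_items' items.
  void ReadRow(int64_t num_items) {
    parent_->AddRowsRead(1);
    parent_->AddCollItemsRead(num_items);
  }

 private:
  ToyScanner* parent_;
};
```

For example, calling ReadRow(3) and then ReadRow(2) on a reader attached to a fresh
ToyScanner leaves the scanner with 2 rows read and 5 collection items read.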
File be/src/exec/

PS2, Line 242: /*num_coll_item_read*/
> We usually still name parameters the same, even if some methods don't use t
The parameter no longer exists. Can we add that to the style guide wiki page as a difference
from the Google C++ Style Guide?
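
For reference, the two conventions being compared might look like the sketch below. All
type and method names here are hypothetical and are not taken from the patch. The first
override comments out the unused parameter name, which the Google C++ Style Guide allows.
The second keeps the name identical to the declaration, which is how I read the
reviewer's comment.

```cpp
// Hypothetical interface with a parameter that only some implementations use.
struct ToyReader {
  virtual ~ToyReader() {}
  virtual int Read(int max_values, int* num_coll_items_read) = 0;
};

// Convention 1: comment out the name of the unused parameter.
struct ScalarReaderCommentedName : ToyReader {
  int Read(int max_values, int* /*num_coll_items_read*/) override {
    return max_values;
  }
};

// Convention 2: keep the parameter named exactly as in the declaration,
// even though this implementation does not use it.
struct ScalarReaderNamed : ToyReader {
  int Read(int max_values, int* num_coll_items_read) override {
    (void)num_coll_items_read;  // Silences unused-parameter warnings.
    return max_values;
  }
};
```

Both overrides behave the same at run time; the difference is purely in how the
signature reads.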
File be/src/exec/parquet-column-readers.h:

PS2, Line 546: 4
> ?
Will fix.


Gerrit-MessageType: comment
Gerrit-Change-Id: I7f6efddaea18507482940f5bdab7326b6482b067
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Tianyi Wang <>
Gerrit-HasComments: Yes
