impala-reviews mailing list archives

From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5210: Count rows and collection items in parquet scanner separately
Date Thu, 24 Aug 2017 23:13:43 GMT
Lars Volker has posted comments on this change.

Change subject: IMPALA-5210: Count rows and collection items in parquet scanner separately
......................................................................


Patch Set 4:

(3 comments)

The only thing left is to add a code comment.

http://gerrit.cloudera.org:8080/#/c/7776/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1008:   row_group_rows_read_ += num_rows_read;
> I'm not sure what value should be COUNTER_SET here. Could you elaborate?
See below.


PS3, Line 1009: COUNTER_ADD
> Same as above.
I tried to avoid having to reset the member variable at the beginning of this function,
but I don't think that will work. It would be good to add a comment in the header pointing
out that it gets reset here.


http://gerrit.cloudera.org:8080/#/c/7776/3/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

Line 470:   /// Number of scanners that end up doing no reads because their splits don't overlap
> Updated. In current code it's total number of collection items read in curr
Ah, I see. I think I misunderstood the scope. It actually looks good to me. Can you add to
the comment that the variable is reset in AssembleRows()?
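
For illustration, here is a minimal sketch of the pattern under discussion: a per-call member
variable that is reset at the top of the function and documented as such in the header. It is
not the actual Impala code. The class `ScannerSketch`, the `Counter` struct, and the member
names are hypothetical stand-ins, and the real AssembleRows() has a different signature.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a profile counter; not Impala's RuntimeProfile counter.
struct Counter {
  int64_t value = 0;
  void Add(int64_t delta) { value += delta; }
};

// Hypothetical, heavily simplified scanner used only to show the reset pattern.
class ScannerSketch {
 public:
  /// Processes one call's worth of batches. Each entry in 'batch_sizes' is the
  /// number of items read in that batch (an assumption made for this sketch).
  void AssembleRows(const std::vector<int64_t>& batch_sizes) {
    // Reset the per-call member here so values from a previous call do not leak in.
    items_read_in_call_ = 0;
    for (int64_t num_read : batch_sizes) {
      items_read_in_call_ += num_read;
      // The profile counter keeps accumulating across calls.
      total_items_counter_.Add(num_read);
    }
  }

  int64_t items_read_in_call() const { return items_read_in_call_; }
  int64_t total_items() const { return total_items_counter_.value; }

 private:
  /// Number of items read during the current AssembleRows() call.
  /// Reset at the beginning of AssembleRows().
  int64_t items_read_in_call_ = 0;

  /// Running total across all AssembleRows() calls.
  Counter total_items_counter_;
};
```

The header comment on `items_read_in_call_` is the kind of note being asked for: it tells the
reader where the reset happens without having to look at the .cc file.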


-- 
To view, visit http://gerrit.cloudera.org:8080/7776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f6efddaea18507482940f5bdab7326b6482b067
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <twang@cloudera.com>
Gerrit-HasComments: Yes
