impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads
Date Tue, 13 Sep 2016 00:47:18 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads

Patch Set 3:

File be/src/exec/hdfs-parquet-scanner.h:

PS3, Line 430: avg_timer_
let's update the variable names too. e.g. something like process_footer_timer_stats_ or such
File be/src/util/runtime-profile-counters.h:

PS2, Line 240: t TSummaryStatsCount
> Yes that's right. The only reason I kept it as a Counter instead of a value

I think it's more useful to remove the restriction. Even in hdfs-parquet-scanner where you're
using this, there is a Counter constructed just for it's timer, it's not included in a profile.
That code might as well have just used a timing function explicitly. (I'm not saying it's
worth changing that code, just that there's no dependency on this taking Counters.)

Also, I don't think we get much from checking the Counter's unit. We don't really set counter
units carefully in many cases (e.g. we use UNIT all over the place instead of anything useful),
and ultimately it's up to the user of this interface to make sure what they're providing here
makes sense.
File be/src/util/runtime-profile-counters.h:

PS3, Line 240: Overwrite
"SetStats()" to be more consistent with some of the other fns, e.g. SetSamples(). If I saw
Overwrite() I'd wonder why there wasn't just a copy constructor or something like that.
File be/src/util/

Line 661:   }
> Done
File be/src/util/

PS3, Line 668: - 
I don't see this dash in the example above? Is it intentional? I see the event markers under
an event sequence make a list, though those are items of a specific timeline. I don't see
why that'd be the case here.

Also value() is an average, right? I think it'd be good to have that specified.

It'd be nice to include the number of samples as well. Also, avg isn't really all that different
from the other stats, so maybe just print them all the same way, e.g.:

" <name>:  avg: <avgval>, min: <minval>, max: <maxval>, num_samples:

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: Yes

View raw message