impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads
Date Fri, 28 Oct 2016 22:50:07 GMT
Hello Henry Robinson, Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

to look at the new patch set (#11).

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads

IMPALA-3823: Add timer to measure Parquet footer reads

It's been observed that Parquet footer reads perform poorly especially
when reading from S3. This patch adds a timer "FooterProcessingTimer"
which keeps a track of the average time each split of each scan node
spends in reading and processing the parquet footer.

Added a new utility counter called SummaryStatsCounter which keeps
track of the min, max and average values seen so far from a set of
values. This counter is used to calculate the min, max and average
time taken to scan and process Parquet footers per query per node.

The RuntimeProfile has also been updated to keep a track of, display
and serialize this new counter to thrift.

BE tests have been added to verify that this counter works fine.

Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
M be/src/exec/
M be/src/exec/hdfs-parquet-scanner.h
M be/src/util/runtime-profile-counters.h
M be/src/util/
M be/src/util/
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/query_test/
8 files changed, 303 insertions(+), 16 deletions(-)

  git pull ssh:// refs/changes/71/4371/11
To view, visit
To unsubscribe, visit

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Sailesh Mukil <>

View raw message