impala-dev mailing list archives

From "Sailesh Mukil (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads
Date Sun, 11 Sep 2016 03:09:31 GMT
Sailesh Mukil has uploaded a new change for review.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads

IMPALA-3823: Add timer to measure Parquet footer reads

Parquet footer reads have been observed to perform poorly, especially
when reading from S3. This patch adds a timer, "FooterProcessingTimer",
which tracks the average time each split of each scan node spends
reading and processing the Parquet footer.

Added a new utility counter called MinMaxAvgValueCounter, which keeps
track of the min, max and average of the values seen so far. This
counter is used to calculate the min, max and average time taken to
scan and process Parquet footers per query per node. It is also
displayed in the RuntimeProfile.
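
As a rough illustration of the idea behind such a counter, here is a
minimal sketch. The class name MinMaxAvgSketch and its methods are
hypothetical and are not taken from the patch; the real
MinMaxAvgValueCounter in be/src/util/runtime-profile-counters.h may
differ in interface, locking and units.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <mutex>

// Hypothetical stand-in for the counter described above; this is NOT
// Impala's actual MinMaxAvgValueCounter.
class MinMaxAvgSketch {
 public:
  // Records one sample, e.g. the footer processing time of one split.
  void Update(int64_t value) {
    std::lock_guard<std::mutex> l(lock_);
    min_ = std::min(min_, value);
    max_ = std::max(max_, value);
    sum_ += value;
    ++count_;
  }

  // Number of samples recorded so far.
  int64_t count() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_;
  }

  // Smallest sample seen, or 0 if nothing has been recorded yet.
  int64_t min() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_ == 0 ? 0 : min_;
  }

  // Largest sample seen, or 0 if nothing has been recorded yet.
  int64_t max() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_ == 0 ? 0 : max_;
  }

  // Integer average of all samples, or 0 if nothing has been recorded.
  int64_t avg() const {
    std::lock_guard<std::mutex> l(lock_);
    return count_ == 0 ? 0 : sum_ / count_;
  }

 private:
  mutable std::mutex lock_;  // Guards all fields below.
  int64_t min_ = std::numeric_limits<int64_t>::max();
  int64_t max_ = std::numeric_limits<int64_t>::min();
  int64_t sum_ = 0;
  int64_t count_ = 0;
};
```

In this sketch each scanner would call Update() once per split with
the elapsed footer time, and the profile would read min(), max() and
avg() when it is printed. Keeping the sum and count rather than a
running average avoids accumulating rounding error across updates.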

The RuntimeProfile has also been updated to track and display this new
MinMaxAvgValueCounter and to transfer it between nodes through Thrift.

A test has been added to verify that this counter works correctly when
there are multiple blocks to scan per node.

Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
M be/src/exec/
M be/src/exec/hdfs-parquet-scanner.h
M be/src/util/runtime-profile-counters.h
M be/src/util/
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/query_test/
7 files changed, 216 insertions(+), 4 deletions(-)

  git pull ssh:// refs/changes/71/4371/1

Gerrit-MessageType: newchange
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <>
