impala-reviews mailing list archives

From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] DRAFT IMPALA-5185: Skip pages based on Parquet::Statistics
Date Thu, 20 Jul 2017 06:22:26 GMT
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7354

Change subject: DRAFT IMPALA-5185: Skip pages based on Parquet::Statistics
......................................................................

DRAFT IMPALA-5185: Skip pages based on Parquet::Statistics

Already done:
  - Refactor row group skipping into context creation and processing
  - Split root level readers into constrained and non-constrained
  - Basic row skipping logic in scanner
  - Switch to absolute row numbers in column readers
  - Have a NextValueToRead() logic in column readers
  - Skipping rows in CollectionColumnReaders
  - Skipping pages in BoolColumnReader
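
The idea behind the items above (skip pages using their min/max
statistics, then track absolute row numbers so readers know which row to
read next) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code in this change: the
PageStats and RowRange structs and the ComputeRowRanges() and
NextRowToRead() functions are hypothetical names, values are simplified to
int64_t, and the predicate is assumed to be a closed range [lo, hi].

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-page summary; not Impala's actual data structure.
struct PageStats {
  int64_t first_row;  // absolute row number of the first row in the page
  int64_t num_rows;   // number of rows stored in the page
  int64_t min_value;  // smallest value in the page
  int64_t max_value;  // largest value in the page
};

// Half-open range [begin, end) of absolute row numbers that must be read.
struct RowRange {
  int64_t begin;
  int64_t end;
};

// Returns the row ranges covered by pages whose [min_value, max_value]
// overlaps the predicate range [lo, hi]. Pages are assumed to be sorted by
// first_row; adjacent surviving pages are merged into a single range.
std::vector<RowRange> ComputeRowRanges(const std::vector<PageStats>& pages,
                                       int64_t lo, int64_t hi) {
  std::vector<RowRange> ranges;
  for (const PageStats& p : pages) {
    assert(p.num_rows >= 0);
    // No value in this page can satisfy lo <= x <= hi, so skip the page.
    if (p.max_value < lo || p.min_value > hi) continue;
    const int64_t begin = p.first_row;
    const int64_t end = p.first_row + p.num_rows;
    if (!ranges.empty() && ranges.back().end == begin) {
      ranges.back().end = end;  // extend the previous range
    } else {
      ranges.push_back({begin, end});
    }
  }
  return ranges;
}

// Returns the smallest absolute row number >= current_row that lies inside
// one of the ranges, or -1 if no such row exists.
int64_t NextRowToRead(const std::vector<RowRange>& ranges,
                      int64_t current_row) {
  for (const RowRange& r : ranges) {
    if (current_row < r.end) {
      return current_row < r.begin ? r.begin : current_row;
    }
  }
  return -1;
}
```

With four 100-row pages whose value ranges are [1, 10], [11, 20], [21, 30]
and [5, 8], a predicate of [15, 25] leaves only rows 100 through 299 to
read, so a reader positioned at row 0 would jump directly to row 100.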

What's still missing:
  - Cleaning up the SkipValue/SkipValueBatch methods
  - Add skipping support to the Parquet deserializer

Change-Id: I8eec838c5baf22167049f570dd0ef9762c5ae0a6
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-column-stats.cc
M be/src/exec/parquet-column-stats.h
M be/src/util/parquet-reader.cc
A gen_data.py
M tests/query_test/test_parquet_stats.py
9 files changed, 860 insertions(+), 90 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7354/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8eec838c5baf22167049f570dd0ef9762c5ae0a6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Pooja Nilangekar <pooja.nilangekar@cloudera.com>
