impala-reviews mailing list archives

From "Vuk Ercegovac (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)
Date Fri, 10 Nov 2017 18:47:50 GMT
Vuk Ercegovac has uploaded this change for review.

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)

IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)

Currently, for filesystems that do not expose physical
block information (e.g., block replica locations, caching),
synthetic blocks are generated and stored in the catalog
when metadata is loaded. Examples of filesystems for which this is
done include S3, ADLS, and the local filesystem.

This change avoids generating these blocks when metadata is loaded.
Instead, scan ranges are directly generated from such files by the
HDFSScanNode when planning. As a result, less space is used for the
catalog and less network bandwidth is needed during its replication.
In addition, a bug is avoided where non-splittable files were being
split anyway to support the query parameter that places a limit on
scan ranges.
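
As a rough illustration of the idea above (not the code in this change), the
sketch below derives scan ranges directly from a file's length at planning
time, and never splits a non-splittable file. The class `ScanRangeSketch`, its
`Range` type, and the parameters `fileLength`, `maxRangeLength`, and
`splittable` are hypothetical names chosen for this example; `maxRangeLength`
stands in for whatever limit the planner applies to scan ranges.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; names and structure are not taken from Impala. */
final class ScanRangeSketch {
  /** A byte range [offset, offset + length) within a single file. */
  static final class Range {
    final long offset;
    final long length;

    Range(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /**
   * Builds scan ranges for one file without relying on stored block metadata.
   *
   * @param fileLength     file size in bytes
   * @param maxRangeLength assumed upper bound on a range; values <= 0 mean "no limit"
   * @param splittable     whether the file format allows reading from an arbitrary offset
   */
  static List<Range> generate(long fileLength, long maxRangeLength, boolean splittable) {
    List<Range> ranges = new ArrayList<>();
    if (fileLength <= 0) return ranges; // an empty file produces no ranges

    // A non-splittable file (or no limit) is always covered by exactly one range.
    if (!splittable || maxRangeLength <= 0) {
      ranges.add(new Range(0, fileLength));
      return ranges;
    }

    // Otherwise, cut the file into consecutive ranges of at most maxRangeLength bytes.
    long offset = 0;
    while (offset < fileLength) {
      long length = Math.min(maxRangeLength, fileLength - offset);
      ranges.add(new Range(offset, length));
      offset += length;
    }
    return ranges;
  }
}
```

Under these assumptions, `ScanRangeSketch.generate(250, 100, true)` yields
three ranges (bytes 0-100, 100-200, and 200-250), while the same call with
`splittable` set to `false` yields a single range covering all 250 bytes.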

This change is marked WIP pending tests for s3 and adls, and to get
initial feedback on the approach. The main thing I'm looking for
is whether there are thoughts on pushing more of the logic directly
into the coordinator.

- local filesystem tests exercise this code path
- manually tried larger local filesystem tables (tpch) with multiple
  partitions and observed the same scan ranges.
- TODO: s3 and adls testing

Change-Id: I544650c87cf6e4ddc984079294393cc6571de355
M fe/src/main/java/org/apache/impala/catalog/
M fe/src/main/java/org/apache/impala/catalog/
M fe/src/main/java/org/apache/impala/planner/
3 files changed, 153 insertions(+), 114 deletions(-)

  git pull ssh:// refs/changes/18/8518/1

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I544650c87cf6e4ddc984079294393cc6571de355
Gerrit-Change-Number: 8518
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <>
