impala-reviews mailing list archives

From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5955: Use totalSize tblproperty instead of rawDataSize.
Date Fri, 22 Sep 2017 03:39:04 GMT
Alex Behm has submitted this change and it was merged.

Change subject: IMPALA-5955: Use totalSize tblproperty instead of rawDataSize.

IMPALA-5955: Use totalSize tblproperty instead of rawDataSize.

Today, Impala populates the 'rawDataSize' property
during COMPUTE STATS for the purpose of extrapolating
row counts based on file sizes.

After this patch, Impala populates 'totalSize' instead of
'rawDataSize'. 'rawDataSize' is no longer populated or used.
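
As a rough illustration of extrapolating row counts from file sizes (a sketch
only, not Impala's actual code; the function name, the -1 "unknown" convention,
and the rounding are assumptions made for this example), the row count recorded
at COMPUTE STATS time can be scaled by the ratio of the current on-disk size to
the recorded 'totalSize':

```python
def extrapolate_row_count(stats_num_rows, stats_total_size, current_total_size):
    """Estimate the current number of rows from on-disk sizes.

    stats_num_rows:     row count recorded when stats were computed
    stats_total_size:   on-disk bytes recorded at that time ('totalSize')
    current_total_size: on-disk bytes of the files as they are now
    Returns -1 when there is not enough information to extrapolate
    (hypothetical convention for this sketch).
    """
    if stats_num_rows < 0 or stats_total_size <= 0 or current_total_size < 0:
        return -1
    # Assume rows are spread evenly over the bytes on disk.
    rows_per_byte = stats_num_rows / stats_total_size
    return round(rows_per_byte * current_total_size)
```

The estimate is only meaningful when both sizes are measured the same way,
which is why an on-disk size is paired with file sizes here.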

Intended meaning/use of tblproperties:
- 'rawDataSize' is the estimated in-memory size of a table
  (without encoding and compression)
- 'totalSize' represents the on-disk size

Using the fields correctly is important for compatibility
with other users of the HMS such as Hive and SparkSQL.
For example, SparkSQL relies on the 'totalSize' for
join ordering.
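
To show why the property matters to other HMS users, here is a minimal sketch
of a consumer that compares 'totalSize' values to choose the smaller join
input. This is not SparkSQL's actual logic: the helper names and the plain
dict standing in for the table properties are hypothetical.

```python
def total_size_bytes(tbl_properties):
    """Return the table's 'totalSize' as an int, or None if absent/invalid."""
    raw = tbl_properties.get("totalSize")
    try:
        size = int(raw)
    except (TypeError, ValueError):
        return None
    return size if size >= 0 else None


def pick_build_side(left_props, right_props):
    """Pick the smaller input by on-disk size; None if either size is unknown."""
    left = total_size_bytes(left_props)
    right = total_size_bytes(right_props)
    if left is None or right is None:
        return None
    return "left" if left <= right else "right"
```

A table that carries only 'rawDataSize' leaves such a consumer without a size
estimate, which is the compatibility problem this change addresses.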

- core/hdfs run passed

Change-Id: If7c2c4e1e99b297c849f9f0d18b2bef34ad811c6
Tested-by: Impala Public Jenkins
Reviewed-by: Alex Behm <>
M fe/src/main/java/org/apache/impala/catalog/
M fe/src/main/java/org/apache/impala/service/
M fe/src/test/java/org/apache/impala/planner/
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
4 files changed, 25 insertions(+), 25 deletions(-)

  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If7c2c4e1e99b297c849f9f0d18b2bef34ad811c6
Gerrit-Change-Number: 8110
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Impala Public Jenkins
