impala-reviews mailing list archives

From "Henry Robinson (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner
Date Fri, 18 Nov 2016 06:35:39 GMT
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner
......................................................................

IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner

 * Extend metadata checks to allow more than one possible physical type
   for a given logical type.
 * Change decimal decoding to handle the non-fixed-length format in the
   same code path as the fixed-length encoding.
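
As an illustration of the two points above, here is a minimal sketch, not
the code in this patch. It assumes the Parquet convention that a
byte-array-encoded decimal stores its unscaled value as big-endian two's
complement, and it assumes that the two byte-array physical types are the
ones accepted for decimals. The names PhysicalType,
IsSupportedDecimalPhysicalType and DecodeUnscaledDecimal are hypothetical,
and the sketch is limited to values that fit in 8 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Parquet physical type enum.
enum class PhysicalType { INT32, INT64, FIXED_LEN_BYTE_ARRAY, BYTE_ARRAY };

// Assumption: a decimal column may be backed by either byte-array type,
// i.e. one logical type maps to more than one physical type.
inline bool IsSupportedDecimalPhysicalType(PhysicalType t) {
  return t == PhysicalType::FIXED_LEN_BYTE_ARRAY ||
         t == PhysicalType::BYTE_ARRAY;
}

// Decodes a big-endian two's-complement unscaled value of `len` bytes
// (1 <= len <= 8) into *out. The same routine serves both encodings:
// for the fixed-length type `len` comes from the schema, for the
// variable-length type it comes from the value's length prefix.
inline bool DecodeUnscaledDecimal(const uint8_t* buf, size_t len,
                                  int64_t* out) {
  if (buf == nullptr || out == nullptr) return false;
  if (len == 0 || len > sizeof(int64_t)) return false;
  // Seed with all ones for negative values so the result is sign-extended.
  uint64_t v = (buf[0] & 0x80) ? ~uint64_t{0} : uint64_t{0};
  for (size_t i = 0; i < len; ++i) v = (v << 8) | buf[i];
  // Copy the bit pattern instead of casting, to stay portable.
  std::memcpy(out, &v, sizeof(v));
  return true;
}
```

With this shape, the only per-value difference between the two encodings
is where `len` comes from, which is the kind of extra branch whose cost
the Perf section below measures.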

Testing:

 * Query test that decodes both plain and dictionary-encoded decimals
   using binary encoding.

Perf:

 * Tested computing SUM(col) for 1 billion distinct dictionary-encoded
   decimal(12,2) values using FIXED_LEN_BYTE_ARRAY physical type encoding.
 * The overhead of decoding with the extra branch was measured at 1s in
   total across the 1 billion values; i.e. the per-decode overhead is
   about 1ns.

Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
A testdata/data/binary_decimal_dictionary.parquet
A testdata/data/binary_decimal_no_dictionary.parquet
A testdata/workloads/functional-query/queries/QueryTest/decimal-encodings.test
M tests/query_test/test_scanners.py
7 files changed, 135 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/5115/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
