impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner
Date Fri, 20 Oct 2017 00:36:39 GMT
Tim Armstrong has posted comments on this change. ( )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

Patch Set 5:


Mostly just cleanup and simplification comments now.
File be/src/exec/hdfs-parquet-scanner.h:
I should have mentioned this earlier, but I think we mostly use CamelCase for type names inside
templates. We're not totally consistent throughout the codebase but that's the general pattern.
File be/src/exec/parquet-common.h:
PS5, Line 328: inline int ParquetPlainEncoder::Encode(
It looks like we could also deduplicate the Encode/Decode methods for FIXED_LEN_BYTE_ARRAY
with the same technique.
PS5, Line 350: Decode<Decimal4Value,parquet::Type::FIXED_LEN_BYTE_ARRAY>(const uint8_t*
nit: missing space
File be/src/exec/
PS4, Line 193: izeof(D
> for var len decimals, its encoded size is bytes required to store size + le
Thanks for the explanation.
File be/src/exec/
PS5, Line 112: //    Decimal4Value: General test case,
I find this a bit confusing still because the values being encoded aren't in the table. Could
we take it even further and have something like:

  struct DecimalTestCase {
    int decimal_byte_size;
    int128_t decimal_val;
    int expected_size;

That way all of the inputs to each test would be in the same place.
PS5, Line 200: sizeof(int32_t)
Is there any reason to keep the encoding overhead separate from the encoded size? I would
have thought we would just check the total size.

To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig <>
Gerrit-Reviewer: Bikramjeet Vig <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-Comment-Date: Fri, 20 Oct 2017 00:36:39 +0000
Gerrit-HasComments: Yes

  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message