impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4815, IMPALA-4817, IMPALA-4819: Populate Parquet Statistics for remaining types
Date Fri, 05 May 2017 21:02:11 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4815, IMPALA-4817, IMPALA-4819: Populate Parquet Statistics for remaining

Patch Set 6:


getting close
File be/src/exec/

Line 543:     if (col_idx < col_orders.size()) col_order = &col_orders[col_idx];
what does this mean if col_orders is not empty? (does the spec allow a partial col_orders?)
File be/src/exec/

Line 1016:   file_metadata_.__isset.column_orders = true;
do we have a test that checks the integrity of generated parquet data? if so, we should also
check for the existence of this field.
File be/src/exec/

Line 104:         StringValue* result = reinterpret_cast<StringValue*>(slot);
why no padding here?

do stats for char columns make sense, given that our char comparisons are broken?
File be/src/exec/parquet-column-stats.h:

Line 34: /// decode parquet::Statistics into slots.
> Done
i meant: update the comment to reflect the changes in this patch.

Line 108: 
> Done
please state explicitly that T must be one of the *Value classes
File be/src/exec/parquet-column-stats.h:

Line 105:   /// 'col_order'. Otherwise, returns false.
what if col_order == 0?

Line 172:   static void CopyToBuffer(StringBuffer* buffer, StringValue* value);
move to -base and remove template params in .inl.h?
File be/src/exec/parquet-column-stats.inline.h:

Line 123:   // The ParquetPlainEncoder interface expects mutable pointers.
several function signatures in parquet-common.h look wrong (output args at beginning, missing
const, etc.)

please leave a todo there to fix it (or simply fix it - the fact that it's wrong is spilling
over into your new code, which isn't good).
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 290: select count(*) from functional_parquet.alltypessmall where string_col < "1"
> I'm not sure I understand your comment. The test in L274 filters all rows u
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 418: # Test CHAR type columns.
our char implementation is broken, i'm not sure these tests are meaningful.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I3ef4a5d25a57c82577fd498d6d1c4297ecf39312
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message