impala-dev mailing list archives

From "Tim Armstrong (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) PREVIEW: Basic column-wise slot materialization in Parquet scanner.
Date Wed, 13 Apr 2016 17:05:46 GMT
Tim Armstrong has posted comments on this change.

Change subject: PREVIEW: Basic column-wise slot materialization in Parquet scanner.

Patch Set 1:


Is the long-term plan to keep the scratch batch in the row-wise format? It seems like this
should work ok cache-wise (batch should fit in cache, memory access pattern will have gaps
but a regular stride), but having the values densely packed would allow some optimisations
down the road. I suspect it would be slightly faster in the short term but I don't know if
it would have a long term impact.
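
For illustration, a rough sketch of the two layouts being compared. All names
here (RowWiseBatch, MaterializeStrided, MaterializeDense, ReadSlot) are
hypothetical and are not taken from the patch or from Impala's code; the sketch
assumes fixed-size tuples and a single int32 column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical row-wise scratch batch: fixed-size tuples stored back to back.
struct RowWiseBatch {
  size_t tuple_size;          // bytes per tuple
  std::vector<uint8_t> data;  // num_tuples * tuple_size bytes
};

// Row-wise layout: write one column's values into the same slot of each tuple.
// Consecutive writes are tuple_size bytes apart (gaps, but a regular stride).
void MaterializeStrided(const std::vector<int32_t>& values, size_t slot_offset,
                        RowWiseBatch* batch) {
  assert(slot_offset + sizeof(int32_t) <= batch->tuple_size);
  assert(values.size() * batch->tuple_size <= batch->data.size());
  uint8_t* dst = batch->data.data() + slot_offset;
  for (int32_t v : values) {
    std::memcpy(dst, &v, sizeof(v));
    dst += batch->tuple_size;
  }
}

// Densely packed layout: the column's values are contiguous in memory.
void MaterializeDense(const std::vector<int32_t>& values,
                      std::vector<int32_t>* column) {
  column->assign(values.begin(), values.end());
}

// Read an int32 slot back out of a row-wise batch.
int32_t ReadSlot(const RowWiseBatch& batch, size_t row, size_t slot_offset) {
  int32_t v;
  std::memcpy(&v, batch.data.data() + row * batch.tuple_size + slot_offset,
              sizeof(v));
  return v;
}
```

The dense variant is a plain contiguous array, which is the kind of layout
that the later optimisations mentioned above would be able to work on.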
File be/src/exec/

Line 1732:   // and return an output batch with relatively few rows.
The TODO describes the current intended behaviour, so that sounds right. I think sending small
batches up the tree is ok for selective scans.

Line 1737:       // Optimization for scans with selective filters/conjuncts: None of the
Is this factoring in accumulated disk buffers?

Line 1829: ReadValueBatch
Ignoring return value?

I think we need to be careful about propagating errors, since it could end badly if
there's a read error and we try to evaluate conjuncts or filters over bogus data.

The existing code avoids this by checking for errors every row.
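
A minimal sketch of the per-batch equivalent, assuming the batch read reports
failure through its return value. The Status struct, this ReadValueBatch
signature, and ProcessBatch are hypothetical stand-ins, not Impala's actual
classes or the code in this patch; a negative input value stands in for a
read error.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical minimal status type; not Impala's Status class.
struct Status {
  bool ok;
  std::string msg;
};
inline Status OkStatus() { return Status{true, ""}; }
inline Status ErrorStatus(const std::string& m) { return Status{false, m}; }

// Hypothetical batch reader: copies up to max_values values into out and
// fails on a negative value, which stands in for a corrupt or unreadable one.
Status ReadValueBatch(const std::vector<int32_t>& encoded, size_t max_values,
                      std::vector<int32_t>* out) {
  out->clear();
  for (size_t i = 0; i < encoded.size() && i < max_values; ++i) {
    if (encoded[i] < 0) {
      return ErrorStatus("corrupt value at index " + std::to_string(i));
    }
    out->push_back(encoded[i]);
  }
  return OkStatus();
}

// Materialize a batch, then evaluate the conjunct only if the read succeeded.
Status ProcessBatch(const std::vector<int32_t>& encoded, size_t max_values,
                    const std::function<bool(int32_t)>& conjunct,
                    std::vector<int32_t>* passing) {
  std::vector<int32_t> scratch;
  Status status = ReadValueBatch(encoded, max_values, &scratch);
  // On error the scratch values may be partial or bogus: return before any
  // conjunct or filter sees them.
  if (!status.ok) return status;
  for (int32_t v : scratch) {
    if (conjunct(v)) passing->push_back(v);
  }
  return OkStatus();
}
```

Checking once per batch keeps the error check out of the per-row loop while
still ensuring nothing is evaluated over a failed read.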

Gerrit-MessageType: comment
Gerrit-Change-Id: I72a613fa805c542e39df20588fb25c57b5f139aa
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Alex Behm <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes