impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4923: reduce memory transfer for selective scans
Date Mon, 22 May 2017 21:17:19 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4923: reduce memory transfer for selective scans
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6949/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1031:     DCHECK_EQ(0, scratch_batch_->total_allocated_bytes());
> Where do the decompression buffers get freed?
IMPALA-5304 made that redundant (I missed doing this cleanup in that patch).


http://gerrit.cloudera.org:8080/#/c/6949/2/be/src/exec/parquet-scratch-tuple-batch.h
File be/src/exec/parquet-scratch-tuple-batch.h:

PS2, Line 48: MemPool
> It's not clear from the var names comments where the var-len data goes. Tha
I elaborated on the comments to make it a bit more explicit.


Line 50:   // Pool used to accumulate other memory such as decompression buffers that may be
> may be referenced
Done


Line 109:     dst_batch->tuple_data_pool()->AcquireData(&aux_mem_pool, false);
> I would have thought that the var-len data like strings or collections can 
We need to transfer the allocations for var-len data regardless since we have no way to know
for sure that previously-returned batches don't reference some of that data.

So we could still run into a similar problem with excessive transfer of var-len data in some
cases, but I think that's impossible to solve without larger changes to the memory ownership
model, because it would require more precise tracking of which buffers may be referenced by
already-returned rows. Reducing the amount of fixed-length data transferred will still help a
lot by cutting the overall volume of transfers and the lock contention that comes with them.

Copying the var-len data is also counter-productive if it is dictionary-encoded, since a
dictionary value shared by many rows would be duplicated once per row.
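
To illustrate why transferring beats copying here, a minimal standalone sketch (simplified
stand-in types, not Impala's actual definitions; the AcquireData() call is the one quoted
above):

  // Simplified stand-in for Impala's StringValue: the slot stores a pointer
  // into a shared buffer (e.g. a decompression or dictionary buffer), not
  // the string bytes themselves.
  struct StringValue {
    const char* ptr;
    int len;
  };

  int main() {
    // One dictionary buffer backing a dictionary-encoded string column.
    static const char dict[] = "appleorange";
    StringValue apple{dict, 5};       // slot in an already-returned batch
    StringValue orange{dict + 5, 6};  // slot still in the scratch batch

    // Both slots alias 'dict'. The scanner can't prove that rows returned in
    // earlier batches no longer reference the buffer, so it can neither free
    // it nor copy it selectively; the safe move is to hand the whole pool to
    // the output batch:
    //   dst_batch->tuple_data_pool()->AcquireData(&aux_mem_pool, false);
    // Deep-copying instead would duplicate the shared dictionary entry once
    // per referencing row.
    return apple.len + orange.len == sizeof(dict) - 1 ? 0 : 1;
  }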


Line 130:     if (num_output_batches > 1) return false;
> This new compaction has non-obvious caveats like this one, and I find the f
I don't think this is a significant problem for performance: if the scan is selective, then
far fewer rows survive from each scratch batch than fit in an output batch, so only a handful
of scratch batches will hit this edge case.

Agree it's subtle, but I think there will be subtleties regardless with our current memory
transfer model. I tried to keep this encapsulated in FinalizeTransfer()/TryCompact()/Reset()
so that it isn't spread throughout the code.

I think the alternative has some issues that I would rather avoid: doing two passes over the
batch will be slower, since the loop in ProcessScratchBatch() is already tight.

It would also complicate the logic around 'tuple_mem' because compacted memory lives temporarily
in the scratch batch before being moved to the destination batch - I think you would need
additional state in the scratch batch to track both the compacted tuples and the tuple buffer
that you're going to reuse.
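
For concreteness, a rough sketch of the shape of that encapsulation (hypothetical member
names and simplified logic, not the actual patch):

  #include <cstdint>

  // Hypothetical, simplified scratch batch showing where the guard sits.
  class ScratchTupleBatch {
   public:
    // Compaction copies the few surviving fixed-length tuples into a small
    // buffer so that only that buffer, not the full scratch allocation, has
    // to be transferred to the output batch.
    bool TryCompact() {
      // If this scratch batch's rows were flushed into more than one output
      // batch, an earlier output batch may still reference tuple memory in
      // 'tuple_mem_', so it isn't safe to reuse or overwrite it; fall back
      // to transferring the whole buffer. For a selective scan this is rare,
      // because few rows survive per scratch batch.
      if (num_output_batches_ > 1) return false;
      // ... copy surviving tuples to the front of a compact buffer ...
      return true;
    }

   private:
    int num_output_batches_ = 0;    // output batches this scratch batch fed
    uint8_t* tuple_mem_ = nullptr;  // fixed-length tuple storage
  };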


Line 139:     for (int i = dst_batch->num_rows(); i < end_row; ++i) {
> Don't we have a CopyRows() for this in RowBatch?
That seems to copy the tuple pointers rather than the tuples themselves.
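
For context, a minimal sketch of the distinction (simplified stand-in types, not Impala's
actual RowBatch API):

  #include <cstdint>
  #include <cstring>

  struct Tuple;  // opaque fixed-length tuple data
  struct TupleRow { Tuple* tuples[1]; };  // the real row is sized by its descriptor

  // Shallow copy, roughly what CopyRows() does: only the Tuple* pointer
  // moves, so the destination row still points at the source tuple memory,
  // which must then be kept alive or transferred.
  inline void CopyRowShallow(const TupleRow* src, TupleRow* dst) {
    dst->tuples[0] = src->tuples[0];
  }

  // Deep copy, what this loop needs: the fixed-length tuple bytes land in
  // memory owned by the destination batch, so the source buffer is no longer
  // referenced afterwards and can be reused.
  inline void CopyTupleDeep(const Tuple* src, uint8_t* dst_mem, int tuple_byte_size) {
    std::memcpy(dst_mem, src, tuple_byte_size);
  }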


-- 
To view, visit http://gerrit.cloudera.org:8080/6949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3773dc63c498e295a2c1386a15c5e69205e747ea
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
