impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-3344: Simplify sorter and document/enforce invariants.
Date Fri, 03 Jun 2016 04:02:04 GMT
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants.

IMPALA-3344: Simplify sorter and document/enforce invariants.

Clarify relationships between classes, clean up the previous mess
where every class was friends with the other so there's an actual
distinction between public and private members. TupleIterator
is now no longer tied to TupleSorter, just Run.

Document and enforce invariants in many cases.

Factor out some functions from large functions.

Simplify and document iterator logic.

Make management of buffers when iterating over output stream more
explicitly correct: either use MarkNeedToReturn() or attach block
to the batch as appropriate. The SortedRunMerger didn't handle
resource transfer correctly, except if all the memory came from
the batch's MemPool. This patch fixes the cases when resources
are attached to the batches, but not the 'need_to_return' case.
Document that SortedRunMerger requires 'deep_copy_input' to be true
if batches can have the 'need_to_return' flag set.

Also use the atomic block exchange operation when moving between
blocks in unpinned runs to prevent pin failures at that point.
I explicitly have avoided changing the hairy block management logic
when allocating buffers for merging, that will need addressing in
a follow-up patch.

Add a SpilledRuns counter so that it's more explicit that spilling

Added some tests for corner cases with empty and NULL strings.
Fixed a test that previously failed with OOM but now succeeds.

Benchmarking against old code initial revealed some regressions from
changes in inlining. Force inlining the TupleComparator::operator() and
iterator Next()/Prev() functions helped and performance seems similar or
slightly better on the targeted orderby benchmarks.

Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
Reviewed-by: Tim Armstrong <>
Tested-by: Tim Armstrong <>
M be/src/exec/
M be/src/exprs/
M be/src/exprs/agg-fn-evaluator.h
M be/src/exprs/
M be/src/exprs/case-expr.h
M be/src/exprs/
M be/src/exprs/compound-predicates.h
M be/src/exprs/
M be/src/exprs/conditional-functions.h
M be/src/exprs/
M be/src/exprs/expr-context.h
M be/src/exprs/
M be/src/exprs/
M be/src/exprs/expr.h
M be/src/exprs/
M be/src/exprs/hive-udf-call.h
M be/src/exprs/
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/
M be/src/exprs/literal.h
M be/src/exprs/
M be/src/exprs/null-literal.h
M be/src/exprs/
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/
M be/src/exprs/slot-ref.h
M be/src/exprs/
M be/src/exprs/tuple-is-null-predicate.h
M be/src/runtime/
M be/src/runtime/row-batch.h
M be/src/runtime/
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/
M be/src/runtime/sorter.h
M be/src/runtime/tuple-row.h
M be/src/util/tuple-row-compare.h
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/single-node-large-sorts.test
M testdata/workloads/functional-query/queries/QueryTest/sort.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M tests/query_test/
41 files changed, 1,262 insertions(+), 967 deletions(-)

  Tim Armstrong: Looks good to me, approved; Verified

To view, visit
To unsubscribe, visit

Gerrit-MessageType: merged
Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
Gerrit-PatchSet: 22
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tim Armstrong <>

View raw message