impala-reviews mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] DRAFT - IMPALA-5498: Support for partial sorts
Date Fri, 23 Jun 2017 19:49:51 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: DRAFT - IMPALA-5498: Support for partial sorts
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7267/1/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

Line 104:       RETURN_IF_ERROR(sorter_->Open());
> I think you'll want to wait until the next call into GetNext() to re-open t
To be clear - what you're saying is that Open() will want to allocate memory for the new run,
but it may not be able to get it because it's being used by row_batch.

So this is a perf issue, not a correctness issue, and it still makes sense to call Reset()
here.

I'll make this change; I just want to be sure that the memory ownership transfers are happening
correctly and there aren't any potential data races.
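
A minimal sketch of the deferred re-open being discussed, for illustration only. ToySorter and ToyPartialSortNode are hypothetical stand-ins, not Impala's Sorter or PartialSortNode, and the real memory accounting is omitted. The point is only the ordering: Reset() happens when a run is finished, while Open() is postponed to the start of the next GetNext() call, by which time the caller is assumed to have released the previous output batch.

```cpp
// Hypothetical stand-in for a sorter; this is NOT Impala's Sorter API.
class ToySorter {
 public:
  // In a real sorter this is where memory for a new run would be allocated.
  void Open() {
    open_ = true;
    ++open_calls_;
  }
  // In a real sorter this is where the finished run would be released.
  void Reset() { open_ = false; }
  bool is_open() const { return open_; }
  int open_calls() const { return open_calls_; }

 private:
  bool open_ = false;
  int open_calls_ = 0;
};

// Hypothetical node that emits one sorted run per GetNext() call.
class ToyPartialSortNode {
 public:
  void GetNext() {
    // Lazy re-open: only open the sorter when a new run is about to start.
    if (!sorter_.is_open()) sorter_.Open();
    // ... add input rows, sort the run, fill the output batch (omitted) ...
    // The run is done, so reset now but do not re-open yet; the output
    // batch handed to the caller may still hold memory at this point.
    sorter_.Reset();
  }
  const ToySorter& sorter() const { return sorter_; }

 private:
  ToySorter sorter_;
};
```

With this ordering the sorter is never open between calls, so nothing competes with the returned batch for memory.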


Line 115:   do {
> I think it would make sense to open the sorter here (if it's not already op
Done


http://gerrit.cloudera.org:8080/#/c/7267/1/be/src/exec/partial-sort-node.h
File be/src/exec/partial-sort-node.h:

PS1, Line 30: a single tuple
> can you clarify, this is a bit unclear
Done


PS1, Line 34: If a merge phase was performed in the sort, sorted rows are deep copied into
            : /// the output batch.
> I thought there wouldn't need to be a merge
Whoops, forgot to update this. This whole comment is just copied from SortNode.


http://gerrit.cloudera.org:8080/#/c/7267/1/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

PS1, Line 353:   2: required bool use_top_n;
> this will move to TSortType, right?
Done


PS1, Line 355: 3: optional i64 offset
> I don't think we'll need to implement the behavior to support this for part
It's also used with total sorts: we make a cost-based decision about whether to execute a
sort with a limit as a top-n (implemented as a heap) or as a total sort that then only outputs
n rows.
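
As a rough illustration of the two strategies (not Impala's code; the function names SortThenLimit and TopN, the use of plain ints as rows, and ascending order are all assumptions made for this sketch), both produce the same rows for a given offset and limit. The heap variant only ever retains offset + limit rows, while the total sort keeps everything and trims at output time.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Total sort, then output only the rows in [offset, offset + limit).
std::vector<int> SortThenLimit(std::vector<int> rows, std::size_t offset,
                               std::size_t limit) {
  std::sort(rows.begin(), rows.end());
  std::size_t begin = std::min(offset, rows.size());
  std::size_t end = std::min(begin + limit, rows.size());
  return std::vector<int>(rows.begin() + begin, rows.begin() + end);
}

// Top-n: a max-heap bounded to offset + limit rows, so the largest retained
// row is evicted whenever a smaller row arrives.
std::vector<int> TopN(const std::vector<int>& rows, std::size_t offset,
                      std::size_t limit) {
  const std::size_t keep = offset + limit;
  std::priority_queue<int> heap;  // std::priority_queue is a max-heap by default
  for (int r : rows) {
    if (heap.size() < keep) {
      heap.push(r);
    } else if (keep > 0 && r < heap.top()) {
      heap.pop();
      heap.push(r);
    }
  }
  // Drain the heap (largest first), then reverse into ascending order.
  std::vector<int> kept;
  while (!heap.empty()) {
    kept.push_back(heap.top());
    heap.pop();
  }
  std::reverse(kept.begin(), kept.end());
  // Skip the first `offset` rows of the retained prefix.
  std::size_t begin = std::min(offset, kept.size());
  return std::vector<int>(kept.begin() + begin, kept.end());
}
```

In both variants the offset has to be applied after ordering, which is why the field is meaningful for a total sort with a limit as well as for top-n.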


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
