impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3742: partitions DMLs for Kudu tables
Date Fri, 10 Mar 2017 03:10:55 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3742: partitions DMLs for Kudu tables

Patch Set 6:

File be/src/runtime/

PS6, Line 446: TupleRow* current_row = batch->GetRow(i);
             :       uint32_t partition;
             :       RETURN_IF_ERROR(partitioner_->Partition(current_row, &partition));
             :       RETURN_IF_ERROR(channels_[partition % num_channels]->AddRow(current_row));
As we discussed, I think Henry is right that we need to avoid the v-f-calls per row.

I think we can consider two approaches:
1) we can continue branching separately for kudu and for hash partitioning here, and calling
non virtual fns on the partitioner. It's still good to have pulled all the Kudu code into
a separate class.
2) the partitioner class you added can take row batches at a time as well as well as the channels_,
and handle the per-row calls internally.

#2 sounds cleaner but I'm open to suggestions.
File bin/

PS6, Line 124: 4cd7d85
btw this won't be in your change, right? we still need to get the Kudu API in
File common/thrift/Partitions.thrift:

PS6, Line 42: TKuduDataPartition
brief comment

PS6, Line 43: target_table_id
mention this is necessary for the kudu Partitioner API
File fe/src/main/java/org/apache/impala/analysis/

Line 683:     // Hdfs folder structure correctly.
, or the indexes to construct rows passed to the Kudu partitioning API.

(Or something similar)
File fe/src/main/java/org/apache/impala/planner/

PS6, Line 46: partition

PS6, Line 52: number

Line 55:   private DataPartition(
nit: comment should be called only by the static factory method for Kudu partitioned tables
File fe/src/main/java/org/apache/impala/planner/

PS6, Line 274:     List<Expr> partitionExprs = Lists.newArrayList(modifyStmt.getPartitionExprs());
             :     List<Integer> targetColIdxs = Lists.newArrayList(modifyStmt.getTargetColIdxs());
maybe just create these inline as params to the static constructor method below, since they're
not needed otherwise.
File fe/src/main/java/org/apache/impala/planner/

Line 153:     } else {
a brief comment here may be helpful
File testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test:

PS6, Line 64: [KUDU(]
seems to be printing the alias rather than the base table name as we see in the line above
File testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test:

PS6, Line 29: 01:EXCHANGE [KUDU(1)]
hm this looks fishy with unions. i'm not sure if the approach doesn't make sense with unions,
at least those with constant operands.

can you add a test case which has unions with selecting from a table, e.g.

upsert into foo
select x,y,z from a union all
select x,y,z from b

PS6, Line 117: upsert into functional_kudu.testtbl /*+ clustered */
             : select * from functional_kudu.testtbl
             : ---- PLAN
             : UPSERT INTO KUDU [functional_kudu.testtbl]
             : |
             : 01:SORT
             : |  order by: id ASC NULLS LAST
             : |
             : 00:SCAN KUDU [functional_kudu.testtbl]
             : ---- DISTRIBUTEDPLAN
             : UPSERT INTO KUDU [functional_kudu.testtbl]
             : |
             : 02:SORT
             : |  order by: id DESC NULLS LAST
             : |
             : 01:EXCHANGE [KUDU(]
             : |
             : 00:SCAN KUDU [functional_kudu.testtbl]
we'll wanna do something similar without the clustered hint later, as we discussed

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic10b3295159354888efcde3df76b0edb24161515
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Thomas Tauber-Marshall <>
Gerrit-HasComments: Yes

View raw message