Date: Thu, 1 Oct 2015 14:30:21 -0700
Subject: Partial aggregation in Drill-on-Phoenix
From: Julian Hyde
To: dev@drill.apache.org
Cc: James Taylor, Maryann Xue

Phoenix is able to perform quite a few relational operations on the region server: scan, filter, project, aggregate, sort (optionally with limit). However, the sort and aggregate are necessarily "local". They can only deal with data on that region server, and there needs to be a further operation to combine the results from the region servers.
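To make the "local" versus "combine" distinction concrete, here is a minimal sketch in Python. It is not Phoenix or Drill code: the function names, the (key, value) row shape, and the two sample regions are all hypothetical. Each region server computes per-key sub-totals over its own rows, and a further step merges them, because the same key can appear on more than one server.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Local step (hypothetical): per-key (sum, count) over one server's rows."""
    acc = defaultdict(lambda: [0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {key: (s, c) for key, (s, c) in acc.items()}


def final_aggregate(partials):
    """Combining step (hypothetical): merge the sub-totals from every server."""
    acc = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            acc[key][0] += s
            acc[key][1] += c
    # AVG is derived from the merged sum and count, never by averaging averages.
    return {key: {"sum": s, "count": c, "avg": s / c} for key, (s, c) in acc.items()}


# Made-up sample data: key "x" lives on both region servers.
region_a = [("x", 1), ("y", 2), ("x", 3)]
region_b = [("x", 5), ("z", 4)]

result = final_aggregate([partial_aggregate(region_a), partial_aggregate(region_b)])
```

The local step carries (sum, count) rather than a finished average so that the combining step can still produce a correct AVG; SUM, COUNT, MIN and MAX merge directly.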

The question is how to plan such queries. I think the answer is an AggregateExchangeTransposeRule. The rule would spot an Aggregate on a data source that is split into multiple locations (partitions) and split it into a partial Aggregate that computes sub-totals and a summarizing Aggregate that combines those totals.

How does the planner know that the Aggregate needs to be split? Since the data's distribution has changed, there would need to be an Exchange operator. It is the Exchange operator that triggers the rule to fire.

There are some special cases. If the data is sorted as well as partitioned (say because the local aggregate uses a sort-based algorithm) we could maybe use a more efficient plan. And if the partition key is the same as the aggregation key we don't need a summarizing Aggregate, just a Union.

It turns out not to be very Phoenix-specific. In the Drill-on-Phoenix scenario, once the Aggregate has been pushed through the Exchange (i.e. onto the drill-bit residing on the region server) we can then push the DrillAggregate across the drill-to-phoenix membrane and make it into a PhoenixServerAggregate that executes in the region server.

Related issues:
* https://issues.apache.org/jira/browse/DRILL-3840
* https://issues.apache.org/jira/browse/CALCITE-751

Julian