Return-Path:
X-Original-To: apmail-drill-issues-archive@minotaur.apache.org
Delivered-To: apmail-drill-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 19F3D176B3 for ; Fri, 25 Sep 2015 16:46:05 +0000 (UTC)
Received: (qmail 56848 invoked by uid 500); 25 Sep 2015 16:46:04 -0000
Delivered-To: apmail-drill-issues-archive@drill.apache.org
Received: (qmail 56680 invoked by uid 500); 25 Sep 2015 16:46:04 -0000
Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@drill.apache.org
Delivered-To: mailing list issues@drill.apache.org
Received: (qmail 56569 invoked by uid 99); 25 Sep 2015 16:46:04 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 25 Sep 2015 16:46:04 +0000
Date: Fri, 25 Sep 2015 16:46:04 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (DRILL-1457) Limit operator optimization : push limit operator past exchange operator; disable parallel plan if no order is required.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/DRILL-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908300#comment-14908300 ]

ASF GitHub Bot commented on DRILL-1457:
---------------------------------------

Github user hsuanyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/169#discussion_r40449942

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/LimitUnionExchangeTransposeRule.java ---
    @@ -0,0 +1,64 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.physical;
    +
    +import org.apache.calcite.plan.RelOptRule;
    +import org.apache.calcite.plan.RelOptRuleCall;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rex.RexLiteral;
    +import org.apache.calcite.rex.RexNode;
    +import org.apache.drill.exec.planner.logical.RelOptHelper;
    +
    +import java.math.BigDecimal;
    +
    +public class LimitUnionExchangeTransposeRule extends Prule{
    +  public static final RelOptRule INSTANCE = new LimitUnionExchangeTransposeRule();
    +
    +  private LimitUnionExchangeTransposeRule() {
    +    super(RelOptHelper.some(LimitPrel.class, RelOptHelper.any(UnionExchangePrel.class)), "LimitUnionExchangeTransposeRule");
    +  }
    +
    +  @Override
    +  public boolean matches(RelOptRuleCall call) {
    +    final LimitPrel limit = (LimitPrel) call.rel(0);
    +
    +    return !limit.isPushDown();
    +  }
    +
    +  @Override
    +  public void onMatch(RelOptRuleCall call) {
    +    final LimitPrel limit = (LimitPrel) call.rel(0);
    +    final UnionExchangePrel unionExchangePrel = (UnionExchangePrel) call.rel(1);
    +
    +    RelNode child = unionExchangePrel.getInput();
    +
    +    final int offset = limit.getOffset() != null ? Math.max(0, RexLiteral.intValue(limit.getOffset())) : 0;
    +    final int fetch = limit.getFetch() != null? Math.max(0, RexLiteral.intValue(limit.getFetch())) : 0;
    --- End diff --

    Overall, the approach seems general enough to apply to more than just union-exchange. Can we exploit that?


> Limit operator optimization : push limit operator past exchange operator; disable parallel plan if no order is required.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-1457
>                 URL: https://issues.apache.org/jira/browse/DRILL-1457
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 1.2.0
>
>         Attachments: 0001-DRILL-1457-Push-Limit-past-through-UnionExchange.patch
>
>
> When there is a LIMIT clause in a query, we want to push the LIMIT operator down as far as possible, so that the upstream operators stop executing once the desired number of rows has been fetched.
> Within one execution fragment, Drill applies a pull model. In many cases there is no performance impact if the LIMIT operator is not pushed down, since LIMIT informs the upstream operators to stop. Across multiple fragments, however, Drill uses a push model: if LIMIT is not pushed past the exchange operator, the upstream fragment continues executing until it receives a notice from the downstream fragment, even if the LIMIT operator has already received the required number of rows.
> For instance:
> explain plan for select * from dfs.`/Users/jni/work/tpch-data/tpch-sf10/lineitem` limit 1;
> +------------+------------+
> | 00-00    Screen
> 00-01      SelectionVectorRemover
> 00-02        Limit(fetch=[1])
> 00-03          UnionExchange
> 01-01            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/jni/work/tpch-data/tpch-sf10/lineitem]], selectionRoot=/Users/jni/work/tpch-data/tpch-sf10/lineitem, columns=[SchemaPath [`*`]]]])
> The query profile shows that the Scan operator fetches many more records than desired:
> Minor Fragment   Start   End     Total Time   Max Records   Max Batches
> 01-00-xx         0.507   1.059   0.552        43688         8
> 01-01-xx         0.570   1.054   0.484        27305         5
> 01-02-xx         0.617   1.038   0.421        16383         3
> 01-03-xx         0.668   1.056   0.388        10922         2
> 01-04-xx         0.740   1.055   0.315        10922         2
> 01-05-xx         0.813   1.057   0.244        5461          1
> In the above plan, there are two choices for performance optimization:
> 1) Push the LIMIT operator past the EXCHANGE operator, ideally into the SCAN operator.
> 2) Disable the parallel plan by removing the EXCHANGE operator.
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
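
The first option in the quoted issue (pushing LIMIT past the exchange) can be illustrated with a small standalone simulation. This is only a sketch and does not use Drill or Calcite. The class and method names are hypothetical, and it assumes that the pushed-down limit lets each upstream fragment send at most offset + fetch rows, which the truncated diff above does not show. The top-level limit is still applied after the exchange, so the final result is unchanged while fewer rows cross the exchange.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only; these names are hypothetical and are not Drill APIs.
class LimitPushdownSketch {

    // Assumed cap for one upstream fragment after push-down: the top-level limit
    // still has to skip `offset` rows, so each fragment keeps offset + fetch rows.
    static int perFragmentCap(int offset, int fetch) {
        return Math.max(0, offset) + Math.max(0, fetch);
    }

    // Models the exchange: concatenates what each fragment sends, capped if pushDown is true.
    static List<Integer> exchange(List<List<Integer>> fragments, int offset, int fetch, boolean pushDown) {
        int cap = pushDown ? perFragmentCap(offset, fetch) : Integer.MAX_VALUE;
        List<Integer> received = new ArrayList<>();
        for (List<Integer> fragment : fragments) {
            received.addAll(fragment.subList(0, Math.min(cap, fragment.size())));
        }
        return received;
    }

    // Top-level LIMIT applied to whatever the exchange delivered.
    static List<Integer> limit(List<Integer> rows, int offset, int fetch) {
        int from = Math.min(Math.max(0, offset), rows.size());
        int to = Math.min(from + Math.max(0, fetch), rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    public static void main(String[] args) {
        List<List<Integer>> fragments = List.of(
            List.of(1, 2, 3, 4, 5, 6), List.of(7, 8, 9), List.of(10));
        int offset = 2;
        int fetch = 3;
        List<Integer> plain = exchange(fragments, offset, fetch, false);
        List<Integer> pushed = exchange(fragments, offset, fetch, true);
        System.out.println("rows sent without push-down: " + plain.size());
        System.out.println("rows sent with push-down:    " + pushed.size());
        System.out.println("same result: "
            + limit(plain, offset, fetch).equals(limit(pushed, offset, fetch)));
    }
}
```

In this toy model the exchange concatenates fragments in a fixed order, so both plans return identical rows. A real exchange interleaves batches nondeterministically, so without an ORDER BY only the row count is guaranteed to match.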