drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1457) Limit operator optimization : push limit operator past exchange operator; disable parallel plan if no order is required.
Date Fri, 25 Sep 2015 19:57:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908580#comment-14908580 ]

ASF GitHub Bot commented on DRILL-1457:
---------------------------------------

Github user hsuanyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/169#discussion_r40469361
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/LimitUnionExchangeTransposeRule.java ---
    @@ -0,0 +1,64 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.physical;
    +
    +import org.apache.calcite.plan.RelOptRule;
    +import org.apache.calcite.plan.RelOptRuleCall;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rex.RexLiteral;
    +import org.apache.calcite.rex.RexNode;
    +import org.apache.drill.exec.planner.logical.RelOptHelper;
    +
    +import java.math.BigDecimal;
    +
    +public class LimitUnionExchangeTransposeRule extends Prule{
    +  public static final RelOptRule INSTANCE = new LimitUnionExchangeTransposeRule();
    +
    +  private LimitUnionExchangeTransposeRule() {
    +    super(RelOptHelper.some(LimitPrel.class, RelOptHelper.any(UnionExchangePrel.class)), "LimitUnionExchangeTransposeRule");
    +  }
    +
    +  @Override
    +  public boolean matches(RelOptRuleCall call) {
    +    final LimitPrel limit = (LimitPrel) call.rel(0);
    +
    +    return !limit.isPushDown();
    +  }
    +
    +  @Override
    +  public void onMatch(RelOptRuleCall call) {
    +    final LimitPrel limit = (LimitPrel) call.rel(0);
    +    final UnionExchangePrel unionExchangePrel = (UnionExchangePrel) call.rel(1);
    +
    +    RelNode child = unionExchangePrel.getInput();
    +
    +    final int offset = limit.getOffset() != null ? Math.max(0, RexLiteral.intValue(limit.getOffset())) : 0;
    +    final int fetch = limit.getFetch() != null?  Math.max(0, RexLiteral.intValue(limit.getFetch())) : 0;
    --- End diff --
    
    Cool! I read that one. Thanks!


> Limit operator optimization : push limit operator past exchange operator; disable parallel plan if no order is required.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-1457
>                 URL: https://issues.apache.org/jira/browse/DRILL-1457
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>            Priority: Critical
>             Fix For: 1.2.0
>
>         Attachments: 0001-DRILL-1457-Push-Limit-past-through-UnionExchange.patch
>
>
> When there is a LIMIT clause in a query, we want to push the LIMIT operator down as far as possible, so that the upstream operators stop execution once the desired number of rows has been fetched.
> Within one execution fragment, Drill applies a pull model. In many cases, there is no performance impact if the LIMIT operator is not pushed down, since LIMIT informs the upstream operators to stop. However, across multiple fragments, Drill uses a push model. If LIMIT is not pushed past the exchange operator, the upstream fragment continues executing until it receives a notice from the downstream fragment, even if the LIMIT operator has already received the required number of rows.
> For instance:
> explain plan for select * from dfs.`/Users/jni/work/tpch-data/tpch-sf10/lineitem` limit 1;
> +------------+------------+
> | 00-00    Screen
> 00-01      SelectionVectorRemover
> 00-02        Limit(fetch=[1])
> 00-03          UnionExchange
> 01-01            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/jni/work/tpch-data/tpch-sf10/lineitem]], selectionRoot=/Users/jni/work/tpch-data/tpch-sf10/lineitem, columns=[SchemaPath [`*`]]]])
> The query profile shows that the Scan operator fetches far more records than desired:
> Minor Fragment	Start	End	Total Time	Max Records	Max Batches
> 01-00-xx	0.507	1.059	0.552	43688	8
> 01-01-xx	0.570	1.054	0.484	27305	5
> 01-02-xx	0.617	1.038	0.421	16383	3
> 01-03-xx	0.668	1.056	0.388	10922	2
> 01-04-xx	0.740	1.055	0.315	10922	2
> 01-05-xx	0.813	1.057	0.244	5461	1
> In the above plan, there are two options for performance optimization:
> 1) Push the LIMIT operator past the EXCHANGE operator, ideally into the SCAN operator.
> 2) Disable the parallel plan by removing the EXCHANGE operator.
>  
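
To illustrate option 1, here is a minimal sketch of why a LIMIT can be copied below a union exchange without changing the query result. It is not Drill code: the class and method names are hypothetical, plain lists stand in for record batches, and it assumes (the diff quoted above is cut off before this point) that each sending fragment keeps at most offset + fetch rows while the original LIMIT above the exchange still applies the real offset and fetch. It models only the row counts, not the early termination of the senders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Drill.
public class LimitPushDownSketch {

    /** Keeps rows [offset, offset + fetch) of the input, like a LIMIT operator. */
    static List<Integer> limit(List<Integer> rows, int offset, int fetch) {
        int from = Math.min(offset, rows.size());
        int to = Math.min(rows.size(), from + fetch);
        return new ArrayList<>(rows.subList(from, to));
    }

    /** Concatenates the output of all sending fragments, like a union exchange. */
    static List<Integer> unionExchange(List<List<Integer>> fragments) {
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> fragment : fragments) {
            merged.addAll(fragment);
        }
        return merged;
    }

    /**
     * Assumed push-down: every sending fragment keeps at most offset + fetch rows
     * (offset 0 on the sender side), because any single fragment may end up
     * supplying all rows that the final LIMIT needs.
     */
    static List<List<Integer>> pushDown(List<List<Integer>> fragments, int offset, int fetch) {
        List<List<Integer>> limited = new ArrayList<>();
        for (List<Integer> fragment : fragments) {
            limited.add(limit(fragment, 0, offset + fetch));
        }
        return limited;
    }

    public static void main(String[] args) {
        List<List<Integer>> fragments = List.of(
                List.of(1, 2, 3, 4, 5),
                List.of(6, 7, 8),
                List.of(9, 10, 11, 12));
        int offset = 1;
        int fetch = 2;

        // Original plan shape: Limit on top of UnionExchange.
        List<Integer> original = limit(unionExchange(fragments), offset, fetch);

        // Transposed plan shape: Limit, UnionExchange, then a Limit in each sender.
        List<List<Integer>> limitedFragments = pushDown(fragments, offset, fetch);
        List<Integer> transposed = limit(unionExchange(limitedFragments), offset, fetch);

        System.out.println("same result: " + original.equals(transposed));
        System.out.println("rows sent without push-down: " + unionExchange(fragments).size());
        System.out.println("rows sent with push-down: " + unionExchange(limitedFragments).size());
    }
}
```

A real union exchange does not guarantee the order in which rows from different senders arrive, so the two plans may return different rows in practice. Without an ORDER BY either answer is valid, which is why the issue title ties the optimization to queries where no order is required.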



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
