drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4446) Improve current fragment parallelization module
Date Thu, 07 Apr 2016 22:39:25 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231269#comment-15231269

ASF GitHub Bot commented on DRILL-4446:

Github user sudheeshkatkam commented on a diff in the pull request:

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/planner/fragment/TestHardAffinityFragmentParallelizer.java
    @@ -0,0 +1,192 @@
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.fragment;
    +import com.google.common.collect.HashMultiset;
    +import com.google.common.collect.ImmutableList;
    +import mockit.Mocked;
    +import mockit.NonStrictExpectations;
    +import org.apache.drill.exec.physical.EndpointAffinity;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
    +import org.junit.Test;
    +import java.util.Collections;
    +import java.util.List;
    +import static org.apache.drill.exec.ExecConstants.SLICE_TARGET_DEFAULT;
    +import static org.apache.drill.exec.planner.fragment.HardAffinityFragmentParallelizer.INSTANCE;
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +public class TestHardAffinityFragmentParallelizer {
    --- End diff --
    negative tests?

> Improve current fragment parallelization module
> -----------------------------------------------
>                 Key: DRILL-4446
>                 URL: https://issues.apache.org/jira/browse/DRILL-4446
>             Project: Apache Drill
>          Issue Type: New Feature
>    Affects Versions: 1.5.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 1.7.0
> Current fragment parallelizer {{SimpleParallelizer.java}} can’t handle correctly the
case where an operator has mandatory scheduling requirement for a set of DrillbitEndpoints
and affinity for each DrillbitEndpoint (i.e how much portion of the total tasks to be scheduled
on each DrillbitEndpoint). It assumes that scheduling requirements are soft (except one case
where Mux and DeMux case where mandatory parallelization requirement of 1 unit). 
> An example is:
> Cluster has 3 nodes running Drillbits and storage service on each. Data for a table is
only present at storage services in two nodes. So a GroupScan needs to be scheduled on these
two nodes in order to read the data. Storage service doesn't support (or costly) reading data
from remote node.
> Inserting the mandatory scheduling requirements within existing SimpleParallelizer is
not sufficient as you may end up with a plan that has a fragment with two GroupScans each
having its own hard parallelization requirements.
> Proposal is:
> Add a property to each operator which tells what parallelization implementation to use.
Most operators don't have any particular strategy (such as Project or Filter), they depend
on incoming operator. Current existing operators which have requirements (all existing GroupScans)
default to current parallelizer {{SimpleParallelizer}}. {{Screen}} defaults to new mandatory
assignment parallelizer. It is possible that PhysicalPlan generated can have a fragment with
operators having different parallelization strategies. In that case an exchange is inserted
in between operators where a change in parallelization strategy is required.
> Will send a detailed design doc.

This message was sent by Atlassian JIRA

View raw message