drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1942) Improve off-heap memory usage tracking
Date Wed, 09 Sep 2015 23:52:46 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737826#comment-14737826 ]

ASF GitHub Bot commented on DRILL-1942:
---------------------------------------

Github user jaltekruse commented on a diff in the pull request:

    https://github.com/apache/drill/pull/105#discussion_r39111529
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java ---
    @@ -0,0 +1,199 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill;
    +
    +import java.io.IOException;
    +import java.util.Random;
    +import java.util.Set;
    +import java.util.concurrent.Semaphore;
    +
    +import org.apache.drill.QueryTestUtil;
    +import org.apache.drill.common.exceptions.UserException;
    +import org.apache.drill.common.util.TestTools;
    +import org.apache.drill.exec.proto.UserBitShared;
    +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState;
    +import org.apache.drill.exec.rpc.user.UserResultsListener;
    +import org.junit.Rule;
    +import org.junit.Test;
    +import org.junit.rules.TestRule;
    +
    +import com.google.common.collect.Sets;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertTrue;
    +import static org.junit.Assert.fail;
    +
    +/*
    + * Note that the real interest here is that the drillbit doesn't become
    + * unstable from running a lot of queries concurrently -- it's not about
    + * any particular order of execution. We ignore the results.
    + */
    +public class TestTpchDistributedConcurrent extends BaseTestQuery {
    +  /*
    +   * Longer timeout than usual.
    +   *
    +   * If the test does fail due to a timeout, see the comment in
    +   * ChainingResultListener.queryCompleted() before assuming this
    +   * needs to be adjusted.
    +   */
    +  @Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(120000);
    +
    +  /*
    +   * Valid test names taken from TestTpchDistributed. Fuller path prefixes are
    +   * used so that tests may also be taken from other locations -- more variety
    +   * is better as far as this test goes.
    +   */
    +  private final static String queryFile[] = {
    +    "queries/tpch/01.sql",
    +    "queries/tpch/03.sql",
    +    "queries/tpch/04.sql",
    +    "queries/tpch/05.sql",
    +    "queries/tpch/06.sql",
    +    "queries/tpch/07.sql",
    +    "queries/tpch/08.sql",
    +    "queries/tpch/09.sql",
    +    "queries/tpch/10.sql",
    +    "queries/tpch/11.sql",
    +    "queries/tpch/12.sql",
    +    "queries/tpch/13.sql",
    +    "queries/tpch/14.sql",
    +    // "queries/tpch/15.sql", this creates a view
    +    "queries/tpch/16.sql",
    +    "queries/tpch/18.sql",
    +    "queries/tpch/19_1.sql",
    +    "queries/tpch/20.sql",
    +  };
    +
    +  private final static int TOTAL_QUERIES = 115;
    +  private final static int CONCURRENT_QUERIES = 15;
    +
    +  private final static Random random = new Random(0xdeadbeef); // Use the same seed each time.
    +  private final static String alterSession = "alter session set `planner.slice_target` = 10";
    +
    +  private int remainingQueries = TOTAL_QUERIES - CONCURRENT_QUERIES;
    +  private final Semaphore completionSemaphore = new Semaphore(0);
    +  private final Semaphore submissionSemaphore = new Semaphore(0);
    +  private final Set<UserResultsListener> listeners = Sets.newIdentityHashSet();
    +
    +  private void submitRandomQuery() {
    +    final String filename = queryFile[random.nextInt(queryFile.length)];
    +    final String query;
    +    try {
    +      query = QueryTestUtil.normalizeQuery(getFile(filename)).replace(';', ' ');
    +    } catch(IOException e) {
    +      throw new RuntimeException("Caught exception", e);
    +    }
    +    final UserResultsListener listener = new ChainingSilentListener(query);
    +    client.runQuery(UserBitShared.QueryType.SQL, query, listener);
    +    synchronized(listeners) {
    +      listeners.add(listener);
    +    }
    +  }
    +
    +  private class ChainingSilentListener extends SilentListener {
    +    private final String query;
    +
    +    public ChainingSilentListener(final String query) {
    +      this.query = query;
    +    }
    +
    +    @Override
    +    public void queryCompleted(QueryState state) {
    +      super.queryCompleted(state);
    +
    +      final boolean removed;
    +      synchronized(listeners) {
    +        removed = listeners.remove(this);
    +      }
    +
    +      /*
    --- End diff --
    
    The comment above says: "If the test does fail due to a timeout, see the comment in ChainingResultListener.queryCompleted() before assuming this needs to be adjusted."
    
    I assumed both comments refer to the same timeout. If so, my suggestion is to record the failure somewhere and check for it when we go to start the next query, rather than throwing an exception here, letting the RPC system swallow it and get stuck, and having the test fail due to a timeout.
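    
    A minimal sketch of what I mean by recording the failure, assuming a small helper class. `FailureRecorder`, `record()` and `throwIfFailed()` are hypothetical names and are not part of this pull request; the idea is only that the listener callback stores the first failure without throwing, and the submitting code checks for it before starting another query.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper for illustration; not part of the pull request. */
final class FailureRecorder {
  // Holds the first failure seen by any listener; null while all is well.
  private final AtomicReference<Throwable> firstFailure = new AtomicReference<>();

  /** Intended to be called from a listener callback. Never throws. */
  void record(final Throwable t) {
    // Only the first failure is kept; later ones are ignored.
    firstFailure.compareAndSet(null, t);
  }

  /** Intended to be called before submitting the next query. */
  void throwIfFailed() {
    final Throwable t = firstFailure.get();
    if (t != null) {
      throw new AssertionError("A previously submitted query failed", t);
    }
  }
}
```

    With something like this, queryCompleted() could call record() in place of throwing, and the code that starts the next query could call throwIfFailed() first, so the failure surfaces in the test itself and not as a timeout.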


> Improve off-heap memory usage tracking
> --------------------------------------
>
>                 Key: DRILL-1942
>                 URL: https://issues.apache.org/jira/browse/DRILL-1942
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>            Reporter: Chris Westin
>            Assignee: Chris Westin
>             Fix For: 1.2.0
>
>         Attachments: DRILL-1942.1.patch.txt, DRILL-1942.2.patch.txt, DRILL-1942.3.patch.txt
>
>
> We're using a lot more memory than we think we should. We may be leaking it, or not releasing it as soon as we could.
> This is a call to come up with some improved tracking so that we can get statistics out about exactly where we're using it, and whether or not we can release it earlier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
