Return-Path: 
X-Original-To: apmail-drill-issues-archive@minotaur.apache.org
Delivered-To: apmail-drill-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1928418874 for ; Fri, 15 Jan 2016 05:47:40 +0000 (UTC)
Received: (qmail 66711 invoked by uid 500); 15 Jan 2016 05:47:40 -0000
Delivered-To: apmail-drill-issues-archive@drill.apache.org
Received: (qmail 66690 invoked by uid 500); 15 Jan 2016 05:47:40 -0000
Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Precedence: bulk
List-Help: 
List-Unsubscribe: 
List-Post: 
List-Id: 
Reply-To: dev@drill.apache.org
Delivered-To: mailing list issues@drill.apache.org
Received: (qmail 66681 invoked by uid 99); 15 Jan 2016 05:47:39 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 15 Jan 2016 05:47:39 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id C50ED2C14F7 for ; Fri, 15 Jan 2016 05:47:39 +0000 (UTC)
Date: Fri, 15 Jan 2016 05:47:39 +0000 (UTC)
From: "Victoria Markman (JIRA)" 
To: issues@drill.apache.org
Message-ID: 
In-Reply-To: 
References: 
Subject: [jira] [Updated] (DRILL-4266) Possible memory leak (fragmentation ?) in rpc layer
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/DRILL-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman updated DRILL-4266:
------------------------------------
    Description: 
I have executed 5 tests from the Advanced/mondrian test suite in a loop overnight.

My observation is that direct memory steadily grew from 117MB to 1.8GB and remained at that level for 14875 iterations of the tests.

My question is: why do 5 queries that were able to execute with 117MB of memory require 1.8GB of memory after 5 hours of execution?

Attached:
* Memory used after each test iteration: memComsumption.txt
* Log of the framework run: drill.log.2016-01-12-16
* Tests: test.tar

Setup:
{noformat}
Single node 32 core box.

DRILL_MAX_DIRECT_MEMORY="4G"
DRILL_HEAP="1G"

0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
|               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
1 row selected (1.309 seconds)
{noformat}

Reproduction:
{noformat}
* tar xvf test.tar into the Functional/test directory
* ./run.sh -s Functional/test -g regression -t 180 -n 5 -i 10000000 -m
{noformat}

This is very similar to the behavior Hakim and I observed a long time ago with window functions. Now that the new allocator is in place, we reran this test and saw similar behavior, and the allocator does not seem to think that we have a memory leak. Hence the speculation that memory is leaked in the RPC layer.

I'm going to reduce planner.width.max_per_node and see if it has any effect on memory allocation (speculating again ...)
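
For reference, a minimal sketch of how that option could be lowered from sqlline; the value 4 below is only an illustrative choice for this sketch, not a setting taken from the report:
{noformat}
-- assumption: run from sqlline against the same drillbit; 4 is an arbitrary example,
-- pick any width smaller than the current per-node setting
ALTER SYSTEM SET `planner.width.max_per_node` = 4;

-- verify the change took effect (columns as shown in the sys.options output above)
SELECT name, num_val, status FROM sys.options WHERE name = 'planner.width.max_per_node';
{noformat}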
was:
I have executed 5 tests from Advanced/mondrian test suite in a loop overnight.

My observation is that direct memory steadily grew from 117MB to 1.8GB and remained on that level for 14875 iteration of the tests.

My question is: why do 5 queries that were able to execute with 117MB of memory require 1.8GB of memory after 5 hours of execution ?

Attached:
* Memory used after each test iteration : memComsumption.txt
* Log of the framework run: drill.log.2016-01-12-16
* Tests: test.tar

Setup:
{noformat}
Single node 32 core box.

DRILL_MAX_DIRECT_MEMORY="4G"
DRILL_HEAP="1G"

0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
|               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
1 row selected (1.309 seconds)
{noformat}

{noformat}
Reproduction:
* tar xvd test.tar into Functional/test directory
* ./run.sh -s Functional/test -g regression -t 180 -n 5 -i 10000000 -m
{noformat}

This is very similar behavior as Hakim and I observed long time ago with window functions. Now, that new allocator is in place we rerun this test and we see the similar things, and allocator does not seem to think that we have a memory leak. Hence the speculation that memory is leaked in RPC layer.

I'm going to reduce planner.width.max_per_node and see if it has any effect on memory allocation (speculating again ...)
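
For reference, the per-iteration direct memory numbers collected in memComsumption.txt could presumably also be cross-checked interactively, assuming the sys.memory system table is available in this Drill build; this is only a sketch under that assumption:
{noformat}
-- assumption: sys.memory exists in this build and reports per-drillbit heap and direct usage;
-- run between test iterations to sample direct memory growth
SELECT * FROM sys.memory;
{noformat}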
> Possible memory leak (fragmentation ?) in rpc layer
> ----------------------------------------------------
>
>                 Key: DRILL-4266
>                 URL: https://issues.apache.org/jira/browse/DRILL-4266
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 1.5.0
>            Reporter: Victoria Markman
>         Attachments: drill.log.2016-01-12-16, memComsumption.txt, test.tar
>
>
> I have executed 5 tests from the Advanced/mondrian test suite in a loop overnight.
> My observation is that direct memory steadily grew from 117MB to 1.8GB and remained at that level for 14875 iterations of the tests.
> My question is: why do 5 queries that were able to execute with 117MB of memory require 1.8GB of memory after 5 hours of execution?
> Attached:
> * Memory used after each test iteration: memComsumption.txt
> * Log of the framework run: drill.log.2016-01-12-16
> * Tests: test.tar
> Setup:
> {noformat}
> Single node 32 core box.
> DRILL_MAX_DIRECT_MEMORY="4G"
> DRILL_HEAP="1G"
> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> 1 row selected (1.309 seconds)
> {noformat}
> {noformat}
> Reproduction:
> * tar xvf test.tar into Functional/test directory
> * ./run.sh -s Functional/test -g regression -t 180 -n 5 -i 10000000 -m
> {noformat}
> This is very similar to the behavior Hakim and I observed a long time ago with window functions. Now that the new allocator is in place, we reran this test and saw similar behavior, and the allocator does not seem to think that we have a memory leak. Hence the speculation that memory is leaked in the RPC layer.
> I'm going to reduce planner.width.max_per_node and see if it has any effect on memory allocation (speculating again ...)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)