drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4266) Possible memory leak (fragmentation ?) in rpc layer
Date Sat, 16 Jan 2016 00:13:39 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102752#comment-15102752 ]

Jacques Nadeau commented on DRILL-4266:
---------------------------------------

I have a patch in DRILL-4131 [1] that may give us some more information here. Once I run some more tests, maybe you can try it. It doesn't address allocation, but it does have better metrics. After seeing the growth, it would be interesting to look at the drill.alloc.* metrics in the web UI to see where the growth is coming from.
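The drill.alloc.* counters will show up on the metrics page of the web UI once that patch is in. As a rough way to poll them while the test loop runs, here is a minimal sketch, assuming the default web port 8047 and the /status/metrics JSON endpoint behind the web UI's metrics page; the exact counter names depend on the DRILL-4131 change, so the grep pattern is only a guess:

{noformat}
# Dump the drillbit's metrics registry as JSON and keep only the allocator counters.
# Port and endpoint are the stock Drill defaults; the metric names are assumed to
# carry a drill.alloc prefix under the patched build.
curl -s http://localhost:8047/status/metrics | python -m json.tool | grep -i "drill.alloc"
{noformat}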

I'm running tests on my branch now and will let you know once it is ready to try.

[1] https://github.com/apache/drill/pull/327

> Possible memory leak (fragmentation ?)  in rpc layer
> ----------------------------------------------------
>
>                 Key: DRILL-4266
>                 URL: https://issues.apache.org/jira/browse/DRILL-4266
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - RPC
>    Affects Versions: 1.5.0
>            Reporter: Victoria Markman
>            Assignee: Jacques Nadeau
>         Attachments: drill.log.2016-01-12-16, memComsumption.txt, memComsumption_framework.output_Fri_Jan_15_width_per_node=4.log, test.tar
>
>
> I have executed 5 tests from the Advanced/mondrian test suite in a loop overnight.
> My observation is that direct memory steadily grew from 117MB to 1.8GB and remained at that level for 14,875 iterations of the tests.
> My question is: why do 5 queries that were able to execute with 117MB of memory require 1.8GB of memory after 5 hours of execution?
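> A quick way to sample that growth between iterations is to ask the drillbit itself; this is a minimal check, assuming the sys.memory table is available in this build (it reports heap and direct memory per drillbit):
> {noformat}
> -- Report current heap/direct memory usage for each drillbit in the cluster.
> 0: jdbc:drill:schema=dfs> select * from sys.memory;
> {noformat}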
> Attached:
> * Memory used after each test iteration: memComsumption.txt
> * Log of the framework run: drill.log.2016-01-12-16
> * Tests: test.tar
> Setup:
> {noformat}
> Single node 32 core box. 
> DRILL_MAX_DIRECT_MEMORY="4G"
> DRILL_HEAP="1G"
> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> 1 row selected (1.309 seconds)
> {noformat}
> Reproduction:
> {noformat}
> * tar xvf test.tar into Functional/test directory
> * ./run.sh -s Functional/test -g regression -t 180 -n 5 -i 10000000 -m
> {noformat}
> This is very similar to the behavior Hakim and I observed a long time ago with window functions. Now that the new allocator is in place, we have rerun this test and see similar behavior, and the allocator does not seem to think that we have a memory leak. Hence the speculation that memory is being leaked in the RPC layer.
> I'm going to reduce planner.width.max_per_node and see if it has any effect on memory allocation (speculating again ...)
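> For reference, that can be done per session before rerunning the loop; a minimal sketch, with the dotted option name quoted in backticks and 4 matching the width_per_node=4 run in the attached log:
> {noformat}
> -- Cap the per-node parallelism for this session before restarting the test loop.
> 0: jdbc:drill:schema=dfs> alter session set `planner.width.max_per_node` = 4;
> {noformat}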



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
