drill-dev mailing list archives

From "Khurram Faraaz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5576) OutOfMemoryException when some CPU cores are taken offline while concurrent queries are under execution
Date Wed, 07 Jun 2017 20:40:18 GMT
Khurram Faraaz created DRILL-5576:
-------------------------------------

             Summary: OutOfMemoryException when some CPU cores are taken offline while concurrent queries are under execution
                 Key: DRILL-5576
                 URL: https://issues.apache.org/jira/browse/DRILL-5576
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.11.0
         Environment: 3 nodes CentOS cluster
            Reporter: Khurram Faraaz


When we reduce the number of available CPU cores while concurrent queries are executing,
we see an OutOfMemoryException (OOM).

Drill 1.11.0 commit ID: d11aba2
Three-node CentOS 6.8 cluster
On each node, Drill's direct memory was set with:
export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"16G"}

The node where the Foreman Drillbit runs has 24 cores. The lscpu output below lists 12 of them as online and 12 as offline.
{noformat}
[root@centos-01 logs]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0,2,4,5,8,9,12,14,15,18,20,22
Off-line CPU(s) list:  1,3,6,7,10,11,13,16,17,19,21,23
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
Stepping:              2
CPU MHz:               1600.000
BogoMIPS:              4799.86
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,2,4,5,12,14,15
NUMA node1 CPU(s):     8,9,18,20,22
{noformat}

Java snippet that creates a pool of 48 threads and executes TPC-DS query 11 concurrently:
{noformat}
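        // Pool of 48 worker threads, one per concurrent query.
        ExecutorService executor = Executors.newFixedThreadPool(48);
        try {
            // Submit 48 ConcurrentQuery tasks, all constructed with the same conn.
            for (int i = 1; i <= 48; i++) {
                executor.submit(new ConcurrentQuery(conn));
            }
        } catch (Exception e) {
            // Catches only submission errors; an exception thrown inside a task stays in its Future.
            System.out.println(e.getMessage());
            e.printStackTrace();
        }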
{noformat}
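The snippet above discards the Future returned by each submit call, so a failure inside a query task would only show up in the Drillbit logs. The following is a minimal sketch of a harness that keeps the Futures and counts failed tasks. It is not the reporter's program: ConcurrentRunner, runConcurrently, and the stand-in task in main are hypothetical names, and a real reproduction would pass a task that runs TPC-DS query 11 over a Drill connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness (assumption): not part of Drill or of the original report.
final class ConcurrentRunner {

    /** Runs `copies` instances of `task` on a fixed pool and returns how many of them failed. */
    static int runConcurrently(Callable<Void> task, int copies) {
        ExecutorService executor = Executors.newFixedThreadPool(copies);
        List<Future<Void>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < copies; i++) {
                futures.add(executor.submit(task));
            }
            int failures = 0;
            for (Future<Void> future : futures) {
                try {
                    future.get(); // blocks until this task finishes
                } catch (ExecutionException e) {
                    // The task threw; the original exception is the cause.
                    failures++;
                    System.err.println("Task failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for tasks", e);
                }
            }
            return failures;
        } finally {
            executor.shutdown(); // release the pool threads once all tasks are done
        }
    }

    public static void main(String[] args) {
        // Stand-in task (assumption): fails on every third invocation instead of running a query.
        AtomicInteger counter = new AtomicInteger();
        Callable<Void> task = () -> {
            if (counter.incrementAndGet() % 3 == 0) {
                throw new IllegalStateException("simulated query failure");
            }
            return null;
        };
        int failures = runConcurrently(task, 48);
        System.out.println("failed tasks: " + failures);
    }
}
```

Collecting the Futures also makes the client wait for all 48 tasks, which gives a clear point at which the run is known to be complete.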

While TPC-DS query 11 is running under the program above, we take half of the
CPU cores offline:
{noformat}
[root@centos-01 ~]# sh turnCPUCoresOffline.sh
OFFLINE cores are :
1,3,6-7,10-11,13,16-17,19,21,23
ONLINE cores are :
0,2,4-5,8-9,12,14-15,18,20,22
{noformat}
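The script output above uses range notation (for example 6-7), while the lscpu output lists every CPU individually. The following small sketch expands the range notation so the two listings can be compared. CpuList and parse are hypothetical names introduced for illustration; the contents of turnCPUCoresOffline.sh are not shown in this report and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (assumption): not part of Drill or of the reporter's script.
final class CpuList {

    /** Expands a CPU list such as "0,2,4-5" into the individual CPU numbers [0, 2, 4, 5]. */
    static List<Integer> parse(String cpuList) {
        List<Integer> cpus = new ArrayList<>();
        String trimmed = cpuList.trim();
        if (trimmed.isEmpty()) {
            return cpus;
        }
        for (String part : trimmed.split(",")) {
            // A part is either a single CPU ("3") or an inclusive range ("6-7").
            String[] bounds = part.trim().split("-");
            int first = Integer.parseInt(bounds[0]);
            int last = Integer.parseInt(bounds[bounds.length - 1]);
            for (int cpu = first; cpu <= last; cpu++) {
                cpus.add(cpu);
            }
        }
        return cpus;
    }

    public static void main(String[] args) {
        List<Integer> offline = parse("1,3,6-7,10-11,13,16-17,19,21,23");
        List<Integer> online = parse("0,2,4-5,8-9,12,14-15,18,20,22");
        System.out.println("offline (" + offline.size() + "): " + offline);
        System.out.println("online  (" + online.size() + "): " + online);
        // What the JVM itself reports; whether this tracks offlined cores depends on the JVM and OS.
        System.out.println("availableProcessors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Both lists expand to 12 CPUs each, matching the on-line and off-line CPU lists in the lscpu output.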

The result is an OutOfMemoryException. The drillbit.log files are attached.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
