Date: Tue, 6 Jun 2017 18:38:18 +0000 (UTC)
From: "Paul Rogers (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Updated] (DRILL-5211) Queries fail due to direct memory fragmentation

[ https://issues.apache.org/jira/browse/DRILL-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-5211:
-------------------------------
    Attachment: (was: ApacheDrillMemoryFragmentationBackground.pdf)

> Queries fail due to direct memory fragmentation
> -----------------------------------------------
>
> Key: DRILL-5211
> URL: https://issues.apache.org/jira/browse/DRILL-5211
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.9.0
>
> Attachments: ApacheDrillMemoryFragmentationBackground.pdf, ApacheDrillVectorSizeLimits.pdf
>
>
> Consider a test of the external sort as follows:
> * Direct memory: 3 GB
> * Input file: 18 GB, with one Varchar column of 8K width
> The sort runs, spilling to disk. Once all data arrives, the sort begins to merge the results. But, to do that, it must first do an intermediate merge. For example, in this sort, there are 190 spill files, but only 19 can be merged at a time.
> (Each merge file contains 128 MB batches, and only 19 can fit in memory, giving a total footprint of 2.5 GB, well below the 3 GB limit.)
> Yet, when loading batch xx, Drill fails with an OOM error. At that point, total available direct memory is 3,817,865,216 bytes. (Obtained from {{maxMemory}} in the {{Bits}} class in the JDK.)
> It appears that Drill wants to allocate 58,257,868 bytes, but the {{totalCapacity}} (again in {{Bits}}) is already 3,800,769,206 bytes, causing an OOM.
> The problem is that, at this point, the external sort should not need to ask the system for more memory. The allocator for the external sort is at just 1,192,350,366 bytes before the allocation request. Plenty of spare memory should be available, released when the in-memory batches were spilled to disk prior to merging. Indeed, earlier in the run, the sort had reached a peak memory usage of 2,710,716,416 bytes. This memory should be available for reuse during merging, and is more than sufficient to fill the particular request in question.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
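
The numbers quoted in the report can be cross-checked with simple arithmetic. The sketch below is only an illustration, not Drill code: the constant names and the `summarize` helper are made up for this note, and it assumes "128 MB" means 128 * 2^20 bytes. It shows that the JVM-level headroom (`maxMemory` minus `totalCapacity`) is smaller than the request, while the memory the sort had given back since its peak is far larger than the request.

```python
# Figures quoted in the report, in bytes. Names are illustrative only.
MAX_DIRECT_MEMORY = 3_817_865_216   # maxMemory reported by the JDK's Bits class
TOTAL_CAPACITY = 3_800_769_206      # totalCapacity in Bits at the time of failure
REQUEST = 58_257_868                # size of the allocation Drill attempted
SORT_ALLOCATOR = 1_192_350_366      # sort allocator usage just before the request
SORT_PEAK = 2_710_716_416           # peak usage the sort reached earlier in the run

MERGE_WIDTH = 19                    # spill files merged at a time
BATCH_BYTES = 128 * 1024 * 1024     # assumption: "128 MB" is 128 * 2**20 bytes


def summarize():
    """Derive the quantities the report reasons about."""
    headroom = MAX_DIRECT_MEMORY - TOTAL_CAPACITY   # what the JVM can still hand out
    released = SORT_PEAK - SORT_ALLOCATOR           # freed by the sort since its peak
    return {
        "merge_footprint": MERGE_WIDTH * BATCH_BYTES,
        "headroom": headroom,
        "shortfall": REQUEST - headroom,
        "released_since_peak": released,
        "request_fits_jvm": REQUEST <= headroom,
        "request_fits_released": REQUEST <= released,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these assumptions the request exceeds the JVM headroom by roughly 41 MB, even though about 1.5 GB had been released by the sort since its peak, which is consistent with the report's reading that the freed memory was not available for reuse.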