Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 2437C200BCB for ; Thu, 10 Nov 2016 06:29:01 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 22EE4160B0F; Thu, 10 Nov 2016 05:29:01 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 68970160AFA for ; Thu, 10 Nov 2016 06:29:00 +0100 (CET) Received: (qmail 46879 invoked by uid 500); 10 Nov 2016 05:28:59 -0000 Mailing-List: contact issues-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list issues@drill.apache.org Received: (qmail 46794 invoked by uid 99); 10 Nov 2016 05:28:58 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 10 Nov 2016 05:28:58 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 5AF412C0D55 for ; Thu, 10 Nov 2016 05:28:58 +0000 (UTC) Date: Thu, 10 Nov 2016 05:28:58 +0000 (UTC) From: "Paul Rogers (JIRA)" To: issues@drill.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (DRILL-5027) ExternalSortBatch can use excessive memory for very large queries MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Thu, 10 Nov 2016 05:29:01 -0000 [ https://issues.apache.org/jira/browse/DRILL-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15653086#comment-15653086 ] Paul Rogers commented on DRILL-5027: ------------------------------------ Inspection reveals that the intermediate spill events do not clean up the previous generation of spill files until the Drillbit exits. That is, if we spill files (a, b, c, d) to produce file e, the first four files remain on disk. In particular the {{mergeAndSpill}} method does not call {{BatchGroup.close()}} for each file that is merged into a new spilled batch. However, the batches are removed from the {{spilledBatchGroups}} list, so that the ESB cannot remove them when the query completes. The files are marked "delete on exit", so they are removed when the Drillbit exits. > ExternalSortBatch can use excessive memory for very large queries > ----------------------------------------------------------------- > > Key: DRILL-5027 > URL: https://issues.apache.org/jira/browse/DRILL-5027 > Project: Apache Drill > Issue Type: Bug > Affects Versions: 1.8.0 > Reporter: Paul Rogers > Priority: Minor > > The {{ExternalSortBatch}} (ESB) operator sorts data while spilling to disk as needed to operate within a memory budget. > The sort happens in two phases: > 1. Gather the incoming batches from the upstream operator, sort them, and spill to disk as needed. > 2. Merge the "runs" spilled in step 1. > In most cases, the second step should run within the memory available for the first step (which is why severity is only Minor). However, if the query is exceptionally large, then the second step can use excessive memory. > Here is why. > * In step 1, we create a series of n spill files. Each file contains batches of some maximum size, say 20 MB. > * In step 2, we must simultaneously open all the spilled files and read the first batch into memory in preparation for merging. > Suppose that the query has 1 TB of data. Suppose that 10 GB of memory is available. The result will be 1 TB / 1 GB = 1000 spill files. Suppose each batch in each file is 20 MB. A single-pass merge will need 20 MB * 1000 = 20 GB to operate. But, we only have 10 GB. > The typical solution is to perform the merge in multiple phases, with each phase reading and respilling only enough runs that will fit in memory. In the above case, the first phase would make two iterations, merging 500 GB in each using 10 GB of memory. The second phase would make a single iteration to merge the two first phase files. > The result is much disk I/O, and a requirement for sufficient disk spaces to store two sets of spill files (one from phase i-1, another for phase i). But, the query will complete within the memory budgeted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)