From: "ASF GitHub Bot (JIRA)"
To: issues@drill.apache.org
Date: Tue, 11 Jul 2017 02:44:00 +0000 (UTC)
Subject: [jira] [Commented] (DRILL-5616) Hash Agg Spill: OOM while reading irregular varchar data

    [ https://issues.apache.org/jira/browse/DRILL-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081545#comment-16081545 ]

ASF GitHub Bot commented on DRILL-5616:
---------------------------------------

GitHub user Ben-Zvi opened a pull request:

    https://github.com/apache/drill/pull/871

    DRILL-5616: Add memory checks, plus minor metrics changes

This PR addresses the problem of OOM when handling irregular input (e.g., grouping key sizes varying from 8 to 250 bytes), mainly by adding memory checks so that spilling is triggered more often; the sketch below illustrates the general idea. The PR also contains some other minor changes.
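For illustration only, here is a minimal, self-contained sketch of that kind of memory check. It is not the actual HashAggTemplate code: the class, field, and helper names (MemoryCheckSketch, allocatedSoFar, spillLargestPartition) are hypothetical, and the real operator tracks usage through Drill's allocator rather than plain longs.

{code}
// Hypothetical sketch (not Drill code): after any allocation inside put(),
// check whether current usage plus room for one more worst-case batch
// would exceed the operator's memory limit, and spill a partition if so.
public class MemoryCheckSketch {
  private final long memoryLimit;       // operator memory budget, in bytes
  private long allocatedSoFar;          // bytes currently allocated
  private long estimatedMaxBatchSize;   // worst-case size of one batch

  public MemoryCheckSketch(long memoryLimit, long estimatedMaxBatchSize) {
    this.memoryLimit = memoryLimit;
    this.estimatedMaxBatchSize = estimatedMaxBatchSize;
  }

  // Called whenever put() (or the matching aggregation batch) allocated memory.
  void checkIfSpillIsNeeded(int plannedBatches) {
    // Reserve room for at least one future batch, even when none is planned yet.
    long reserve = Math.max(1, plannedBatches) * estimatedMaxBatchSize;
    if (allocatedSoFar + reserve > memoryLimit) {
      spillLargestPartition();   // hypothetical helper: spill the partition with the most rows
    }
  }

  private void spillLargestPartition() { /* omitted in this sketch */ }
}
{code}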
Details:

(1) Added a flag `needToCheckIfSpillIsNeeded` in `checkGroupAndAggrValues()` that is set to true whenever memory pressure is suspected. The cases are:
    (1.a) The HashTable put() returned KEY_ADDED_LAST (as the prior code did).
    (1.b) The put() allocated memory for anything other than adding a new batch (e.g., hash-table doubling, buffer reallocation, etc.).
    (1.c) The put() allocated a new batch, then a matching aggregation batch was allocated, and the total memory allocated was larger than the estimated max batch size.

(2) With the above change, there is no longer a need to compute space for "all hash tables doubling" (since we now check after each doubling), so `extraMemoryNeededForResize()` was removed from the HashTable. Instead, only the maximum single hash-table doubling is accounted for, in case one of the tables doubles (calculated by multiplying sizes, etc.; see line 1243 in HashAggTemplate).

(3) Changed `plannedBatches` to `Math.max(1, plannedBatches)`, because the memory check may now be called when no new batch is planned.

(4) When calculating the possible number of partitions (in delayedSetup(), line 401), reduced the overhead factor from 8M to 2M, since memory is now checked more often. This allows more partitions.

(5) In doWork(): previously the estimated batch size was updated only when next() on the incoming produced a bigger input batch; this now also applies to next() from the spill file. The motivation was an extreme test case where the first batch read back from the spill was 5M and the next batch was 27M. This fix is not perfect (an OOM is still possible there), but when it does not OOM, the change expedites spills and prevents later OOMs. (With a better sizer we could check the batch when spilling and adjust then; currently the sizer reports zero size on the batch before spilling.)

Other changes (beyond DRILL-5616):

(6) The splitAndTransfer() originally used by the Hash Agg to "move" the key columns actually allocates the offset vector and copies all the values (for varchars, a common key type). This wastes both memory and time, and is probably a leftover from before "realloc" was used (see DRILL-1111). Changed it to a simple transfer() for the normal case, where the offsets run from zero to the max expected number; see the sketch after this list. The non-normal cases probably never happen, so maybe this should later be removed entirely (i.e., undoing the DRILL-1111 fix). Implementation: pass `numPendingOutput` to outputKeys() and check against this number.

(7) Metrics: renamed RESIZING_TIME to RESIZING_TIME_MS (Chun asked for that), also for the Hash Join.

(8) Metrics: SPILL_MB was set inconsistently in two places (one only when non-zero, the other without that check, hence sometimes producing zero). Added the non-zero check (line 968).

(9) Metrics: produce the cycle number even when not spilling (i.e., 0) (Rahul asked for that).

(10) For debugging: added a prefix to the OOM error message, to tell whether it came from the HashTable or the aggregation batch.
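As a rough illustration of item (6), the sketch below shows the general shape of preferring transfer() over splitAndTransfer() when the whole value range is handed over. This is not the actual change in the hash table's outputKeys(); the variable names (keyVector, outgoingVector, numPendingOutput) are placeholders, and the real code's condition may differ.

{code}
// Sketch only: when the outgoing keys cover the full range (0 .. numPendingOutput),
// a plain transfer() hands over the underlying buffers without reallocating the
// offset vector or copying the varchar values; splitAndTransfer() remains only
// for the partial case, which probably never happens in practice.
TransferPair tp = keyVector.makeTransferPair(outgoingVector);
if (numPendingOutput == keyVector.getAccessor().getValueCount()) {
  tp.transfer();                              // zero-copy handoff of the buffers
} else {
  tp.splitAndTransfer(0, numPendingOutput);   // copies values (legacy path)
}
{code}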
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Ben-Zvi/drill DRILL-5616

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/871.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #871

----
commit 38bd3fc4d618f5b17935d5e94273511ac367bdbf
Author: Boaz Ben-Zvi
Date:   2017-07-11T00:53:28Z

    DRILL-5616: Add memory checks, plus minor metrics changes
----

> Hash Agg Spill: OOM while reading irregular varchar data
> ---------------------------------------------------------
>
>                 Key: DRILL-5616
>                 URL: https://issues.apache.org/jira/browse/DRILL-5616
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Boaz Ben-Zvi
>            Assignee: Boaz Ben-Zvi
>             Fix For: 1.11.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> An OOM occurred while aggregating a table of two varchar columns whose sizes vary significantly (about 8 bytes long on average, but 250 bytes max):
>
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 327127360;
> select count( * ) from (select max(`filename`) from dfs.`/drill/testdata/hash-agg/data2` group by no_nulls_col, nulls_col) d;
>
> {code}
> Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> OOM at Second Phase. Partitions: 2. Estimated batch size: 12255232. Planned batches: 0. Rows spilled so far: 434127447 Memory limit: 163563680 so far allocated: 150601728.
> Fragment 1:0
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)