Date: Fri, 22 Apr 2016 21:33:12 +0000 (UTC)
From: "Wei Zheng (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

    [ https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254738#comment-15254738 ]

Wei Zheng commented on HIVE-12837:
----------------------------------

[~sershe] Could you please review?
> Better memory estimation/allocation for hybrid grace hash join during hash table loading
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-12837
>                 URL: https://issues.apache.org/jira/browse/HIVE-12837
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, HIVE-12837.3.patch, HIVE-12837.4.patch
>
>
> This is to avoid an edge case when the memory available is very little (less than a single write buffer size), and we start loading the hash table. Since the write buffer is lazily allocated, we will easily run out of memory before even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can spill partitions if memory is about to be full, to avoid OOM.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
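The arithmetic in the example from the issue description can be checked with a small sketch. This is not Hive code and not the contents of the attached patches. The function names are hypothetical, and the assumption that spilling one partition reclaims its ~16 MB ref array is an illustration only. It shows why a check made before the lazy 8 MB write-buffer allocation would catch the shortfall.

```python
MB = 1024 * 1024


def memory_after_init(total, ref_array_size, partitions_in_memory):
    """Memory left once each in-memory partition has allocated its ref array."""
    return total - ref_array_size * partitions_in_memory


def should_spill_before_load(available, write_buffer_size):
    """Early check: is there room for one lazily allocated write buffer?"""
    return available < write_buffer_size


def partitions_to_spill(available, write_buffer_size,
                        reclaim_per_partition, partitions_in_memory):
    """Count partitions to spill until one write buffer fits.

    Assumes (hypothetically) that spilling a partition frees
    reclaim_per_partition bytes.
    """
    spilled = 0
    while available < write_buffer_size and spilled < partitions_in_memory:
        available += reclaim_per_partition
        spilled += 1
    return spilled


# Numbers taken from the issue description.
total = 210 * MB
ref_array = 16 * MB
write_buffer = 8 * MB
in_memory = 13

left = memory_after_init(total, ref_array, in_memory)
print(left // MB)                                    # 2
print(should_spill_before_load(left, write_buffer))  # True
print(partitions_to_spill(left, write_buffer, ref_array, in_memory))
```

With the figures above, 2 MB remain after initialization, which is less than one 8 MB write buffer, so the early check reports that a partition should be spilled before the first row is loaded.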