Date: Fri, 20 May 2016 17:56:13 +0000 (UTC)
From: "Wei Zheng (JIRA)"
To: dev@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-13809) Hybrid Grace Hash Join memory usage estimation didn't take into account the bloom filter size

Wei Zheng created HIVE-13809:
--------------------------------

Summary: Hybrid Grace Hash Join memory usage estimation didn't take into account the bloom filter size
Key: HIVE-13809
URL: https://issues.apache.org/jira/browse/HIVE-13809
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.0.0, 2.1.0
Reporter: Wei Zheng
Assignee: Wei Zheng

Memory estimation matters during hash table loading because it decides whether the next hash partition is loaded into memory or spilled to disk. If we assume there is enough memory and that turns out not to be the case, we run into an OOM error.

Currently, the hybrid grace hash join memory usage estimate does not take the bloom filter size into account. In large test cases (TB scale) the bloom filter grows to hundreds of MB, which is large enough to cause a significant estimation error.

The solution is to include the bloom filter size in the memory estimate.
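
To illustrate the scale of the error, here is a minimal sketch of the load-or-spill check with and without the bloom filter term. It is not Hive's code: the function names, the partition sizes, the key count, and the 5% false-positive rate are all assumptions chosen for illustration, and the bloom filter size uses the standard textbook formula m = -n * ln(p) / (ln 2)^2 bits.

```python
import math

MIB = 1024 * 1024


def bloom_filter_bytes(expected_entries: int, fpp: float) -> int:
    """Size in bytes of an optimally sized bloom filter.

    Uses the standard formula m = -n * ln(p) / (ln 2)^2 bits.
    This is the textbook estimate, not necessarily what Hive allocates.
    """
    if expected_entries <= 0:
        raise ValueError("expected_entries must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    bits = math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))
    return (bits + 7) // 8  # round up to whole bytes


def can_load_next_partition(memory_limit: int,
                            in_memory_bytes: int,
                            next_partition_bytes: int,
                            bloom_bytes: int = 0) -> bool:
    """Hypothetical load-or-spill check: True means load, False means spill."""
    return in_memory_bytes + next_partition_bytes + bloom_bytes <= memory_limit


# Hypothetical numbers: 2 GiB budget, 1200 MiB already loaded,
# and a 500 MiB partition waiting to be loaded.
memory_limit = 2048 * MIB
in_memory = 1200 * MIB
next_partition = 500 * MIB

# Assumed workload: one billion keys at a 5% false-positive rate,
# which gives a filter of roughly 780 MB.
bloom = bloom_filter_bytes(1_000_000_000, 0.05)

# Ignoring the bloom filter, the partition appears to fit.
fits_without_bloom = can_load_next_partition(memory_limit, in_memory, next_partition)

# Counting the bloom filter, the same partition should be spilled.
fits_with_bloom = can_load_next_partition(memory_limit, in_memory, next_partition, bloom)
```

With these assumed numbers the estimate that ignores the bloom filter chooses to load the partition, while the estimate that counts it chooses to spill, which is the kind of misjudgment that leads to the OOM described above.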