Date: Wed, 3 Jan 2018 08:42:00 +0000 (UTC)
From: "wan kun (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Assigned] (HIVE-18362) Introduce a parameter to control the max row number for map join convertion

[ https://issues.apache.org/jira/browse/HIVE-18362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wan kun reassigned HIVE-18362:
------------------------------

> Introduce a parameter to control the max row number for map join convertion
> ---------------------------------------------------------------------------
>
> Key: HIVE-18362
> URL: https://issues.apache.org/jira/browse/HIVE-18362
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: wan kun
> Assignee: wan kun
> Priority: Minor
>
> The compression ratio of an ORC file can be very high in some cases.
> The test table has three int columns and twelve million records, but the compressed file is only 4M. Hive automatically converts the join to a map join, which then causes a memory overflow. So I think it is better to have a parameter that limits the total number of table records allowed in map join conversion: if the record count exceeds the limit, the join is not converted to a map join.
> *hive.auto.convert.join.max.number = 2500000L*
> The default value for this parameter is 2500000, because that many records occupy about 700M of memory in the client JVM, and a table with 2500000 records is already large for a map join.
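
The proposal can be sketched as an extra row-count check alongside the existing size check. This is a minimal illustration, not Hive's implementation: `TableStats`, `can_convert_to_map_join`, and `estimated_heap_bytes` are hypothetical names, the 25 MB size threshold is an assumed value, and "700M" is read as 700,000,000 bytes. Only the limit of 2500000 and the table figures come from the issue.

```python
from dataclasses import dataclass

# Default row limit proposed in the issue for hive.auto.convert.join.max.number.
DEFAULT_MAX_ROWS = 2_500_000

# Assumption: "700M" in the issue is read as 700,000,000 bytes,
# which gives roughly 280 bytes of heap per record.
BYTES_PER_ROW = 700_000_000 // DEFAULT_MAX_ROWS


@dataclass
class TableStats:
    """Hypothetical statistics holder; not a Hive class."""
    compressed_bytes: int
    row_count: int


def can_convert_to_map_join(stats, size_threshold_bytes, max_rows=DEFAULT_MAX_ROWS):
    """Allow conversion only if both the file size and the row count are small."""
    if stats.compressed_bytes > size_threshold_bytes:
        return False
    # The additional check proposed in the issue.
    return stats.row_count <= max_rows


def estimated_heap_bytes(row_count, bytes_per_row=BYTES_PER_ROW):
    """Rough in-memory footprint, assuming a constant per-row cost."""
    return row_count * bytes_per_row


# The test table from the issue: 12 million rows in a 4M ORC file.
orc_table = TableStats(compressed_bytes=4 * 1024 * 1024, row_count=12_000_000)
SIZE_THRESHOLD = 25_000_000  # assumed size threshold, for illustration only

print(can_convert_to_map_join(orc_table, SIZE_THRESHOLD))
print(estimated_heap_bytes(orc_table.row_count))
```

Under these assumptions the 4M file passes a size-only check, yet its 12 million rows would need several gigabytes of heap, which is the overflow the row limit is meant to prevent.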