Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id C5582200C85 for ; Tue, 30 May 2017 23:39:09 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id C3C2B160BC9; Tue, 30 May 2017 21:39:09 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 17A34160BB1 for ; Tue, 30 May 2017 23:39:08 +0200 (CEST) Received: (qmail 41759 invoked by uid 500); 30 May 2017 21:39:08 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 41748 invoked by uid 99); 30 May 2017 21:39:08 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 30 May 2017 21:39:08 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id B1DAC180516 for ; Tue, 30 May 2017 21:39:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -100.002 X-Spam-Level: X-Spam-Status: No, score=-100.002 tagged_above=-999 required=6.31 tests=[RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id XKApF2orkghE for ; Tue, 30 May 2017 21:39:06 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 399A460D61 for ; Tue, 30 May 2017 21:39:05 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 59A98E0373 for ; Tue, 30 May 2017 21:39:04 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 18CB621B55 for ; Tue, 30 May 2017 21:39:04 +0000 (UTC) Date: Tue, 30 May 2017 21:39:04 +0000 (UTC) From: "BELUGA BEHR (JIRA)" To: dev@hive.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Created] (HIVE-16792) Estimate Rows When Joining BIGINT to INT Column MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 30 May 2017 21:39:09 -0000 BELUGA BEHR created HIVE-16792: ---------------------------------- Summary: Estimate Rows When Joining BIGINT to INT Column Key: HIVE-16792 URL: https://issues.apache.org/jira/browse/HIVE-16792 Project: Hive Issue Type: Improvement Affects Versions: 2.1.1 Reporter: BELUGA BEHR Priority: Minor {code:sql} create table test1 (a int); create table test2 (z bigint); INSERT INTO test1 VALUES (1); INSERT INTO test2 VALUES (2147483648); analyze table test1 compute statistics for columns; analyze table test2 compute statistics for columns; EXPLAIN SELECT * FROM test1 t1 INNER JOIN test2 t2 ON t1.a=t2.z; {code} {code} Explain STAGE DEPENDENCIES: Stage-4 is a root stage Stage-3 depends on stages: Stage-4 Stage-0 depends on stages: Stage-3 "" STAGE PLANS: Stage: Stage-4 Map Reduce Local Work Alias -> Map Local Tables: t2 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: t2 TableScan alias: t2 filterExpr: z is not null (type: boolean) Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE Filter Operator predicate: z is not null (type: boolean) Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE HashTable Sink Operator keys: 0 UDFToLong(a) (type: bigint) 1 z (type: bigint) Stage: Stage-3 Map Reduce Map Operator Tree: TableScan alias: t1 filterExpr: UDFToLong(a) is not null (type: boolean) Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: UDFToLong(a) is not null (type: boolean) Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 UDFToLong(a) (type: bigint) 1 z (type: bigint) outputColumnNames: _col0, _col4" Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col4 (type: bigint)" outputColumnNames: _col0, _col1" Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Local Work: Map Reduce Local Work Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} I would expect that perhaps Hive would be smart enough to know that this join is not going to produce any rows because the MIN VALUE of table test2 is more than INTEGER.MAX_VALUE. -- This message was sent by Atlassian JIRA (v6.3.15#6346)