From: "Zoltan Haindrich (Jira)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Date: Wed, 1 Apr 2020 06:41:00 +0000 (UTC)
Subject: [jira] [Updated] (HIVE-23082) PK/FK stat rescale doesn't work in some cases

     [ https://issues.apache.org/jira/browse/HIVE-23082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Haindrich updated HIVE-23082:
------------------------------------
    Attachment: HIVE-23082.03.patch

> PK/FK stat rescale doesn't work in some cases
> ---------------------------------------------
>
>                 Key: HIVE-23082
>                 URL: https://issues.apache.org/jira/browse/HIVE-23082
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>         Attachments: HIVE-23082.01.patch, HIVE-23082.02.patch, HIVE-23082.03.patch
>
>
> As a result, joins may retain the original estimate; see MAPJOIN_33 in this plan, which retained the estimate of SEL_32:
> {code}
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Map 1 <- Map 2 (BROADCAST_EDGE) |
> | |
> | Stage-0 |
> |   Fetch Operator |
> |     limit:12 |
> |     Stage-1 |
> |       Map 1 vectorized |
> |       File Output Operator [FS_36] |
> |         Limit [LIM_35] (rows=12 width=4) |
> |           Number of rows:12 |
> |           Select Operator [SEL_34] (rows=5040 width=4) |
> |             Output:["_col0"] |
> |             Map Join Operator [MAPJOIN_33] (rows=5040 width=8) |
> |               Conds:SEL_32._col0=RS_30._col0(Inner) |
> |             <-Map 2 [BROADCAST_EDGE] vectorized |
> |               BROADCAST [RS_30] |
> |                 PartitionCols:_col0 |
> |                 Select Operator [SEL_29] (rows=1 width=8) |
> |                   Output:["_col0"] |
> |                   Filter Operator [FIL_28] (rows=1 width=108) |
> |                     predicate:((r_reason_id = 'reason 66') and r_reason_sk is not null) |
> |                     TableScan [TS_3] (rows=2 width=108) |
> |                       default@rx0,reason,Tbl:COMPLETE,Col:COMPLETE,Output:["r_reason_id","r_reason_sk"] |
> |             <-Select Operator [SEL_32] (rows=5040 width=7) |
> |                 Output:["_col0"] |
> |                 Filter Operator [FIL_31] (rows=5040 width=7) |
> |                   predicate:sr_reason_sk is not null |
> |                   TableScan [TS_0] (rows=5112 width=7) |
> |                     default@sr0,store_returns,Tbl:COMPLETE,Col:COMPLETE,Output:["sr_reason_sk"] |
> | |
> +----------------------------------------------------+
> {code}
> repro:
> {code}
> set hive.query.results.cache.enabled=false;
> set hive.explain.user=true;
> drop table if exists default.rx0;
> drop table if exists default.sr0;
> create table rx0 (r_reason_id string, r_reason_sk bigint);
> create table sr0 (sr_reason_sk bigint);
> insert into rx0 values ('AAAAAAAAAAAAAAAA',1),('AAAAAAAAGEAAAAAA',70);
> insert into sr0 values (NULL),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
> (11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),
> (26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),
> (41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),
> (56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70);
> insert into sr0 select a.* from sr0 a,sr0 b;
> -- |sr0| ~ 5112
> explain select 1
> from default.sr0 store_returns , default.rx0 reason
> where sr_reason_sk = r_reason_sk
> and r_reason_id = 'reason 66'
> limit 12;
> {code}
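
The row counts in the plan can be cross-checked against the repro with plain arithmetic. The following is only a sketch of that check in Python; it does not run Hive, and the variable names are illustrative assumptions rather than anything from the Hive code base.

```python
# Row-count arithmetic for the repro above; no Hive involved.
# All variable names are illustrative, not Hive identifiers.

initial_non_null = 70   # sr0 values 1..70 from the first insert
initial_null = 1        # the single NULL row from the first insert
initial_rows = initial_non_null + initial_null

# "insert into sr0 select a.* from sr0 a, sr0 b" appends one row per (a, b)
# pair, so every existing value of a is repeated initial_rows times.
total_rows = initial_rows + initial_rows * initial_rows
non_null_rows = initial_non_null + initial_non_null * initial_rows

rx0_rows = 2            # two rows inserted into rx0

print(total_rows)       # compare with TS_0 (rows=5112)
print(non_null_rows)    # compare with FIL_31 and SEL_32 (rows=5040)
print(rx0_rows)         # compare with TS_3 (rows=2)
```

Under these counts, the 5040 reported for MAPJOIN_33 is exactly the number of non-null sr_reason_sk rows, which is the same figure as SEL_32. That matches the symptom described in the issue: the join kept its input's estimate instead of a rescaled one.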