X-Original-To: apmail-hive-issues-archive@minotaur.apache.org
Delivered-To: apmail-hive-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id B0FEF18897
    for ; Sun, 29 Nov 2015 23:31:11 +0000 (UTC)
Received: (qmail 8959 invoked by uid 500); 29 Nov 2015 23:31:11 -0000
Delivered-To: apmail-hive-issues-archive@hive.apache.org
Received: (qmail 8930 invoked by uid 500); 29 Nov 2015 23:31:11 -0000
Mailing-List: contact issues-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Delivered-To: mailing list issues@hive.apache.org
Received: (qmail 8911 invoked by uid 99); 29 Nov 2015 23:31:11 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 29 Nov 2015 23:31:11 +0000
Received: from arcas.apache.org (localhost [127.0.0.1])
    by arcas (Postfix) with ESMTP id EEDEB2C0453
    for ; Sun, 29 Nov 2015 23:31:10 +0000 (UTC)
Date: Sun, 29 Nov 2015 23:31:10 +0000 (UTC)
From: "Jason Dere (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Commented] (HIVE-12535) Dynamic Hash Join: Key references are cyclic
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HIVE-12535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031217#comment-15031217 ]

Jason Dere commented on HIVE-12535:
-----------------------------------

It looks like that output is due to the user-level explain formatting that is done in common/src/java/org/apache/hadoop/hive/common/jsonexplain/tez/Op.java.
After the initial plan is created, the MapJoin operator looks like this:
{noformat}
"keys:":{"0":"KEY.reducesinkkey0 (type: int)","1":"KEY.reducesinkkey0 (type: int)"}
"input vertices:":{"1":"Map 3"}
{noformat}
Because input "0" (which I think is the big table in this case) is not in the "input vertices" list, it gets resolved in Op.java as the current vertex ("Reducer 2"). So this issue in the explain output is simply cosmetic, but if there is a similar issue in the vectorizer, it could also be related to the fact that the input vertices for the MapJoin do not include the big table. Though I'm not sure whether that mapping is supposed to include the big table; someone else may need to comment on that.

> Dynamic Hash Join: Key references are cyclic
> --------------------------------------------
>
> Key: HIVE-12535
> URL: https://issues.apache.org/jira/browse/HIVE-12535
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.0.0
> Reporter: Gopal V
> Assignee: Jason Dere
> Attachments: philz_26.txt
>
>
> MAPJOIN_4227 is inside "Reducer 2", but refers back to "Reducer 2" in its keys. It should say "Map 1" there.
> {code}
> | |<-Reducer 2 [SIMPLE_EDGE] vectorized, llap |
> | Reduce Output Operator [RS_4189] |
> | key expressions:_col0 (type: string), _col1 (type: int) |
> | Map-reduce partition columns:_col0 (type: string), _col1 (type: int) |
> | sort order:++ |
> | Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions:_col2 (type: double) |
> | Group By Operator [OP_4229] |
> | aggregations:["sum(_col2)"] |
> | keys:_col0 (type: string), _col1 (type: int) |
> | outputColumnNames:["_col0","_col1","_col2"] |
> | Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE |
> | Select Operator [OP_4228] |
> | outputColumnNames:["_col0","_col1","_col2"] |
> | Statistics:Num rows: 166 Data size: 26394 Basic stats: COMPLETE Column stats: COMPLETE |
> | Map Join Operator [MAPJOIN_4227] |
> | | condition map:[{"":"Inner Join 0 to 1"}] |
> | | keys:{"Reducer 2":"KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: int)","Map 5":"KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: int)"} |
> | | outputColumnNames:["_col1","_col3","_col5"] |
> | | Statistics:Num rows: 166 Data size: 26394 Basic stats: COMPLETE Column stats: COMPLETE |
> | |<-Map 5 [CUSTOM_SIMPLE_EDGE] vectorized, llap |
> | | Reduce Output Operator [RS_4226] |
> | | key expressions:_col1 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) |
> | | Map-reduce partition columns:_col1 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) |
> | | sort order:+++ |
> | | Statistics:Num rows: 74973886 Data size: 5098224248 Basic stats: COMPLETE Column stats: COMPLETE |
> | | value expressions:_col0 (type: float), _col2 (type: date) |
> | | Select Operator [OP_4225] |
> | | outputColumnNames:["_col0","_col1","_col2"] |
> | | Statistics:Num rows: 74973886 Data size: 5098224248 Basic stats: COMPLETE Column stats: COMPLETE |
> | | Filter Operator [FIL_4224] |
> | | predicate:((account_id is not null and month(effective_date) BETWEEN 4 AND 7) and month(effective_date) is not null) (type: boolean) |
> | | Statistics:Num rows: 74973886 Data size: 5098224248 Basic stats: COMPLETE Column stats: COMPLETE |
> | | TableScan [TS_4171] |
> | | alias:t |
> | | Statistics:Num rows: 149947772 Data size: 10196448496 Basic stats: COMPLETE Column stats: COMPLETE |
> | |<-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap |
> | Reduce Output Operator [RS_4223] |
> | key expressions:_col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) |
> | Map-reduce partition columns:_col0 (type: bigint), year(_col2) (type: int), month(_col2) (type: int) |
> | sort order:+++ |
> | Statistics:Num rows: 50289673 Data size: 8197216699 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions:_col1 (type: string) |
> | Map Join Operator [MAPJOIN_4222] |
> | | condition map:[{"":"Left Semi Join 0 to 1"}] |
> | | keys:{"Map 1":"_col1 (type: string)","Map 4":"_col0 (type: string)"} |
> | | outputColumnNames:["_col0","_col1","_col2"] |
> | | Statistics:Num rows: 50289673 Data size: 8197216699 Basic stats: COMPLETE Column stats: COMPLETE |
> | |<-Map 4 [BROADCAST_EDGE] vectorized, llap |
> | | Reduce Output Operator [RS_4179] |
> | | key expressions:_col0 (type: string) |
> | | Map-reduce partition columns:_col0 (type: string) |
> | | sort order:+ |
> | | Statistics:Num rows: 1 Data size: 99 Basic stats: COMPLETE Column stats: COMPLETE |
> | | Group By Operator [OP_4219] |
> | | keys:_col0 (type: string) |
> | | outputColumnNames:["_col0"] |
> | | Statistics:Num rows: 1 Data size: 99 Basic stats: COMPLETE Column stats: COMPLETE |
> | | Select Operator [OP_4218] |
> | | outputColumnNames:["_col0"] |
> | | Statistics:Num rows: 3 Data size: 297 Basic stats: COMPLETE Column stats: COMPLETE |
> | | Filter Operator [FIL_4217] |
> | | predicate:(account_type = 'order ahead') (type: boolean) |
> | | Statistics:Num rows: 3 Data size: 294 Basic stats: COMPLETE Column stats: COMPLETE |
> | | TableScan [TS_4168] |
> | | alias:at |
> | | Statistics:Num rows: 13 Data size: 1274 Basic stats: COMPLETE Column stats: COMPLETE |
> | |<-Select Operator [OP_4221] |
> | outputColumnNames:["_col0","_col1","_col2"] |
> | Statistics:Num rows: 50289673 Data size: 8197216699 Basic stats: COMPLETE Column stats: COMPLETE |
> | Filter Operator [FIL_4220] |
> | predicate:(((account_id is not null and (account_type = 'order ahead')) and year(effective_date) is not null) and month(effective_date) is not null) (type: boolean) |
> | Statistics:Num rows: 50289673 Data size: 8197216699 Basic stats: COMPLETE Column stats: COMPLETE |
> | TableScan [TS_4165] |
> | alias:a |
> | Statistics:Num rows: 201158695 Data size: 32788867285 Basic stats: COMPLETE Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
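
For illustration only, the fallback described in the comment (a join input that is missing from "input vertices" gets labelled with the current vertex) can be sketched roughly as below. This is a hypothetical reconstruction, not the code in Op.java: the class name KeyVertexResolver, the method resolveKeys, and its parameters are all assumptions made for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the relabelling step; NOT the actual Op.java code.
final class KeyVertexResolver {

    /**
     * Replaces each join input position ("0", "1", ...) with a vertex name.
     * A position that is absent from inputVertices falls back to the
     * vertex that contains the operator (currentVertex).
     */
    static Map<String, String> resolveKeys(Map<String, String> keys,
                                           Map<String, String> inputVertices,
                                           String currentVertex) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : keys.entrySet()) {
            String vertex = inputVertices.getOrDefault(e.getKey(), currentVertex);
            resolved.put(vertex, e.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Values taken from the MapJoin snippet quoted in the comment.
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("0", "KEY.reducesinkkey0 (type: int)");
        keys.put("1", "KEY.reducesinkkey0 (type: int)");

        Map<String, String> inputVertices = new LinkedHashMap<>();
        inputVertices.put("1", "Map 3");

        System.out.println(resolveKeys(keys, inputVertices, "Reducer 2"));
    }
}
```

With the values from the comment, input "1" is labelled "Map 3" while input "0" falls back to "Reducer 2", which would produce the self-referencing key label seen in the explain output even though the underlying plan is unchanged.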