Return-Path:
X-Original-To: apmail-hive-dev-archive@www.apache.org
Delivered-To: apmail-hive-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A48E11025A for ; Mon, 13 Jan 2014 07:13:22 +0000 (UTC)
Received: (qmail 80522 invoked by uid 500); 13 Jan 2014 07:11:13 -0000
Delivered-To: apmail-hive-dev-archive@hive.apache.org
Received: (qmail 80341 invoked by uid 500); 13 Jan 2014 07:10:38 -0000
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list dev@hive.apache.org
Received: (qmail 80296 invoked by uid 99); 13 Jan 2014 07:10:26 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Jan 2014 07:10:26 +0000
Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id EF40B1C072C; Mon, 13 Jan 2014 07:10:24 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============3687388988177812516=="
MIME-Version: 1.0
Subject: Re: Review Request 16728: Implement non-staged MapJoin
From: "Navis Ryu"
To: "Navis Ryu", "hive"
Date: Mon, 13 Jan 2014 07:10:24 -0000
Message-ID: <20140113071024.20236.75422@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org
Auto-Submitted: auto-generated
Sender: "Navis Ryu"
X-ReviewGroup: hive
X-ReviewRequest-URL: https://reviews.apache.org/r/16728/
X-Sender: "Navis Ryu"
References: <20140113044305.20235.94360@reviews.apache.org>
In-Reply-To: <20140113044305.20235.94360@reviews.apache.org>
Reply-To: "Navis Ryu"
X-ReviewRequest-Repository: hive-git

--===============3687388988177812516==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit


-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/16728/
-----------------------------------------------------------

(Updated Jan. 13, 2014, 7:10 a.m.)


Review request for hive.


Changes
-------

Missed a file


Bugs: HIVE-6144
    https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
-------

For a map join, all data in the small aliases is hashed and stored in a temporary file by MapRedLocalTask. For aliases without a filter or projection, however, this step does not seem necessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:

{noformat}
STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        a
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        a
          TableScan
            alias: a
            HashTable Sink Operator
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              Position of Big Table: 1

  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b
          TableScan
            alias: b
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
{noformat}

Here table src (alias a) is fetched and stored as-is in MRLocalTask. With this patch, the plan can look like the one below.

{noformat}
  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b
          TableScan
            alias: b
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work
          Alias -> Map Local Tables:
            a
              Fetch Operator
                limit: -1
          Alias -> Map Local Operator Tree:
            a
              TableScan
                alias: a
          Has Any Stage Alias: false

  Stage: Stage-0
    Fetch Operator
{noformat}


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6
  conf/hive-default.xml.template d188f2a
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java aa8f19c
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java efe5710
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 0cc90d0
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 2df8ab9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 010ac54
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 14fced7
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 83a778d
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16728/diff/


Testing
-------


Thanks,

Navis Ryu

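
Editorial illustration (not part of the patch): the description says staging seems unnecessary for small aliases "without filter or projection". The Java sketch below shows one way such an eligibility check could be phrased over a toy operator chain. Everything in it is hypothetical: the class `NonStagedMapJoinSketch`, the `Op` enum, and `canSkipStaging` are invented for illustration and do not correspond to Hive classes or to the condition the patch actually implements.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; none of these names come from the Hive code base.
final class NonStagedMapJoinSketch {

    // Operator kinds, loosely named after operators seen in the plans in the description.
    enum Op { TABLE_SCAN, FILTER, SELECT, HASH_TABLE_SINK }

    // Assumed rule: a small alias can skip the staged local task when its local
    // operator chain starts with a table scan and is followed only by the hash
    // table sink, i.e. it contains no filter and no projection.
    static boolean canSkipStaging(List<Op> localChain) {
        if (localChain.isEmpty() || localChain.get(0) != Op.TABLE_SCAN) {
            return false;
        }
        for (Op op : localChain.subList(1, localChain.size())) {
            if (op != Op.HASH_TABLE_SINK) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Alias "a" in the example query: a plain scan feeding the sink.
        System.out.println(canSkipStaging(Arrays.asList(Op.TABLE_SCAN, Op.HASH_TABLE_SINK)));
        // A chain with a filter would still need the staged local task under this rule.
        System.out.println(canSkipStaging(Arrays.asList(Op.TABLE_SCAN, Op.FILTER, Op.HASH_TABLE_SINK)));
    }
}
```

Under this assumed rule the first chain qualifies and the second does not; the real check may consider more than the operator kinds shown here.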
--===============3687388988177812516==--