Subject: svn commit: r789416 [1/2] - in /hadoop/hive/trunk: ./ common/src/java/org/apache/hadoop/hive/conf/ conf/ ql/src/java/org/apache/hadoop/hive/ql/exec/ ql/src/java/org/apache/hadoop/hive/ql/optimizer/ ql/src/java/org/apache/hadoop/hive/ql/parse/ ql/src/te...
Date: Mon, 29 Jun 2009 19:33:44 -0000
To: hive-commits@hadoop.apache.org
From: zshao@apache.org

Author: zshao
Date: Mon Jun 29 19:33:43 2009
New Revision: 789416

URL: http://svn.apache.org/viewvc?rev=789416&view=rev
Log:
HIVE-530. Map Join followup: optimize number of map-reduce jobs.
(Namit Jain via zshao)

Modified:
    hadoop/hive/trunk/CHANGES.txt
    hadoop/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
    hadoop/hive/trunk/conf/hive-default.xml
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
    hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
    hadoop/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join25.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join26.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join27.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join28.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join29.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join32.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join34.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join35.q.out
    hadoop/hive/trunk/ql/src/test/results/clientpositive/join36.q.out

Modified: hadoop/hive/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/CHANGES.txt?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/CHANGES.txt (original)
+++ hadoop/hive/trunk/CHANGES.txt Mon Jun 29 19:33:43 2009
@@ -92,6 +92,9 @@
     HIVE-516. Enable predicate pushdown for junit tests.
     (Prasad Chakka via zshao)
 
+    HIVE-530. Map Join followup: optimize number of map-reduce jobs.
+    (Namit Jain via zshao)
+
   OPTIMIZATIONS
 
     HIVE-279. Predicate Pushdown support (Prasad Chakka via athusoo).
Modified: hadoop/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (original)
+++ hadoop/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java Mon Jun 29 19:33:43 2009
@@ -142,7 +142,9 @@
     HIVEMERGEMAPFILES("hive.merge.mapfiles", true),
     HIVEMERGEMAPFILESSIZE("hive.merge.size.per.mapper", (long)(1000*1000*1000)),
-
+
+    HIVESENDHEARTBEAT("hive.heartbeat.interval", 1000),
+
     // Optimizer
     HIVEOPTPPD("hive.optimize.ppd", false); // predicate pushdown

Modified: hadoop/hive/trunk/conf/hive-default.xml
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/conf/hive-default.xml?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/conf/hive-default.xml (original)
+++ hadoop/hive/trunk/conf/hive-default.xml Mon Jun 29 19:33:43 2009
@@ -219,6 +219,12 @@
 
+  hive.heartbeat.interval
+  1000
+  Send a heartbeat after this interval - used by mapjoin and filter operators
+
+
+
   hive.merge.size.per.mapper
   1000000000
   Size of merged files at the end of the job

Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java Mon Jun 29 19:33:43 2009
@@ -20,6 +20,7 @@
 
 import java.io.Serializable;
 
+import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.ql.metadata.HiveException;
 import org.apache.hadoop.hive.ql.plan.filterDesc;
@@ -40,16 +41,21 @@
   transient private final LongWritable filtered_count, passed_count;
   transient private ExprNodeEvaluator conditionEvaluator;
   transient private PrimitiveObjectInspector conditionInspector;
+  transient private int consecutiveFails;
+  transient int heartbeatInterval;
 
   public FilterOperator () {
     super();
     filtered_count = new LongWritable();
     passed_count = new LongWritable();
+    consecutiveFails = 0;
   }
 
   public void initializeOp(Configuration hconf, Reporter reporter, ObjectInspector[] inputObjInspector) throws HiveException {
-
+    this.reporter = reporter;
+
     try {
+      heartbeatInterval = HiveConf.getIntVar(hconf, HiveConf.ConfVars.HIVESENDHEARTBEAT);
       this.conditionEvaluator = ExprNodeEvaluatorFactory.get(conf.getPredicate());
       statsMap.put(Counter.FILTERED, filtered_count);
       statsMap.put(Counter.PASSED, passed_count);
@@ -70,8 +76,14 @@
     if (Boolean.TRUE.equals(ret)) {
       forward(row, rowInspector);
       passed_count.set(passed_count.get()+1);
+      consecutiveFails = 0;
     } else {
       filtered_count.set(filtered_count.get()+1);
+      consecutiveFails++;
+
+      // In case of a lot of consecutive failures, send a heartbeat in order to avoid timeout
+      if ((consecutiveFails % heartbeatInterval) == 0)
+        reporter.progress();
     }
   }
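The hunks above wire a new knob, hive.heartbeat.interval (HIVESENDHEARTBEAT, default 1000), into FilterOperator: a filter that rejects a long run of rows emits nothing, so the task can look idle to the MapReduce framework and be killed for inactivity. The sketch below is a minimal, self-contained illustration of that pattern, not Hive's code; the class and method names are invented here, and only the consecutive-failure counter, the modulo check, and reporter.progress() come from the patch.

    import org.apache.hadoop.util.Progressable;

    // Hypothetical stand-alone version of the heartbeat logic added to FilterOperator.
    public class FilterHeartbeat {
      private final int heartbeatInterval;   // value of hive.heartbeat.interval
      private final Progressable reporter;   // in Hive this is the mapred Reporter
      private int consecutiveFails = 0;

      public FilterHeartbeat(int heartbeatInterval, Progressable reporter) {
        this.heartbeatInterval = heartbeatInterval;
        this.reporter = reporter;
      }

      // Call once per input row; 'passed' is whether the predicate evaluated to true.
      public void onRow(boolean passed) {
        if (passed) {
          consecutiveFails = 0;   // any forwarded row is visible progress
          return;
        }
        consecutiveFails++;
        // A long run of filtered rows produces no output, so report progress
        // every heartbeatInterval misses to avoid the task being killed for inactivity.
        if (consecutiveFails % heartbeatInterval == 0) {
          reporter.progress();
        }
      }
    }

Resetting the counter on every forwarded row means the heartbeat only fires during uninterrupted stretches of filtered-out rows, which is exactly the case where the task would otherwise appear hung.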
Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java Mon Jun 29 19:33:43 2009
@@ -119,12 +119,19 @@
   transient int metadataKeyTag;
   transient int[] metadataValueTag;
   transient List hTables;
+  transient int numMapRowsRead;
+  transient int heartbeatInterval;
 
   @Override
   public void initializeOp(Configuration hconf, Reporter reporter, ObjectInspector[] inputObjInspector) throws HiveException {
     super.initializeOp(hconf, reporter, inputObjInspector);
+    this.reporter=reporter;
+    numMapRowsRead = 0;
+    firstRow = true;
 
     try {
+      heartbeatInterval = HiveConf.getIntVar(hconf, HiveConf.ConfVars.HIVESENDHEARTBEAT);
+
       joinKeys = new HashMap>();
 
       populateJoinKeyValue(joinKeys, conf.getKeys());
@@ -228,6 +235,11 @@
         firstRow = false;
       }
 
+      // Send some status perodically
+      numMapRowsRead++;
+      if ((numMapRowsRead % heartbeatInterval) == 0)
+        reporter.progress();
+
       HTree hashTable = mapJoinTables.get(alias);
       MapJoinObjectKey keyMap = new MapJoinObjectKey(metadataKeyTag, key);
       MapJoinObjectValue o = (MapJoinObjectValue)hashTable.get(keyMap);

Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java Mon Jun 29 19:33:43 2009
@@ -28,6 +28,7 @@
 import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.ql.exec.Operator;
 import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.exec.SelectOperator;
 import org.apache.hadoop.hive.ql.exec.UnionOperator;
 import org.apache.hadoop.hive.ql.exec.MapJoinOperator;
 import org.apache.hadoop.hive.ql.exec.Task;
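The MapJoinOperator hunk above applies the same heartbeat, but keyed to a plain count of rows read from the streamed (big) table rather than to consecutive misses, since probing the in-memory hash tables can run for a long time without emitting anything. A compact, hypothetical sketch of that variant (names again invented, not Hive's):

    import org.apache.hadoop.util.Progressable;

    // Hypothetical counterpart of the heartbeat on the map-join probe side.
    public class MapJoinHeartbeat {
      private final int heartbeatInterval;   // hive.heartbeat.interval again
      private final Progressable reporter;
      private int numMapRowsRead = 0;

      public MapJoinHeartbeat(int heartbeatInterval, Progressable reporter) {
        this.heartbeatInterval = heartbeatInterval;
        this.reporter = reporter;
      }

      // Call once for every row streamed past the in-memory hash tables.
      public void onMapRow() {
        numMapRowsRead++;
        if (numMapRowsRead % heartbeatInterval == 0) {
          reporter.progress();   // send some status periodically
        }
      }
    }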
Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java Mon Jun 29 19:33:43 2009
@@ -167,13 +167,23 @@
     @Override
     public Object process(Node nd, Stack stack, NodeProcessorCtx procCtx, Object... nodeOutputs) throws SemanticException {
-
+
       SelectOperator sel = (SelectOperator)nd;
       MapJoinOperator mapJoin = (MapJoinOperator)sel.getParentOperators().get(0);
       assert sel.getParentOperators().size() == 1;
-
+
       GenMRProcContext ctx = (GenMRProcContext)procCtx;
       ParseContext parseCtx = ctx.getParseCtx();
+
+      // is the mapjoin followed by a reducer
+      List listMapJoinOps = parseCtx.getListMapJoinOpsNoReducer();
+
+      if (listMapJoinOps.contains(mapJoin)) {
+        ctx.setCurrAliasId(null);
+        ctx.setCurrTopOp(null);
+        return null;
+      }
+
       ctx.setCurrMapJoinOp(mapJoin);
       Task currTask = ctx.getCurrTask();
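The guard added to MapJoinFactory above is what actually saves the extra job: when the optimizer has already put this map join on the no-reducer list, the processor clears the current alias and top operator and returns, so plan generation does not start a follow-up map-reduce task just to run the trailing select and file sink. A simplified sketch of that decision follows, with a hypothetical stand-in for GenMRProcContext rather than the real class:

    import java.util.List;

    // Hypothetical, trimmed-down version of the early return added to MapJoinFactory.
    class MapJoinSelectGuard {
      // Stand-in for the parts of GenMRProcContext the guard touches.
      interface GenCtx {
        void setCurrAliasId(String aliasId);
        void setCurrTopOp(Object topOp);
      }

      // Returns true when the walker should stop here instead of creating a new stage.
      static boolean skipFollowUpStage(Object mapJoinOp,
                                       List<?> listMapJoinOpsNoReducer,
                                       GenCtx ctx) {
        if (listMapJoinOpsNoReducer.contains(mapJoinOp)) {
          ctx.setCurrAliasId(null);   // nothing left for GenMapRed to stitch into a new task
          ctx.setCurrTopOp(null);
          return true;
        }
        return false;
      }
    }

This is why the join25 through join29 golden files below lose a stage: the select over the map join's output now runs inside the same map-only task instead of a second job reading the intermediate SequenceFile.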
Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java Mon Jun 29 19:33:43 2009
@@ -22,12 +22,12 @@
 import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.Iterator;
+import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
-import java.util.Vector;
+import java.util.Stack;
 
-import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.ql.exec.ColumnInfo;
 import org.apache.hadoop.hive.ql.exec.JoinOperator;
 import org.apache.hadoop.hive.ql.exec.MapJoinOperator;
@@ -37,16 +37,21 @@
 import org.apache.hadoop.hive.ql.exec.ReduceSinkOperator;
 import org.apache.hadoop.hive.ql.exec.RowSchema;
 import org.apache.hadoop.hive.ql.exec.SelectOperator;
-import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher;
+import org.apache.hadoop.hive.ql.lib.Dispatcher;
+import org.apache.hadoop.hive.ql.lib.GraphWalker;
+import org.apache.hadoop.hive.ql.lib.NodeProcessor;
+import org.apache.hadoop.hive.ql.lib.NodeProcessorCtx;
+import org.apache.hadoop.hive.ql.lib.Rule;
+import org.apache.hadoop.hive.ql.lib.RuleRegExp;
 import org.apache.hadoop.hive.ql.parse.ErrorMsg;
+import org.apache.hadoop.hive.ql.parse.GenMapRedWalker;
 import org.apache.hadoop.hive.ql.parse.OpParseContext;
 import org.apache.hadoop.hive.ql.parse.ParseContext;
 import org.apache.hadoop.hive.ql.parse.QBJoinTree;
 import org.apache.hadoop.hive.ql.parse.RowResolver;
-import org.apache.hadoop.hive.ql.parse.SemanticAnalyzer;
 import org.apache.hadoop.hive.ql.parse.SemanticException;
 import org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory;
-import org.apache.hadoop.hive.ql.parse.joinCond;
 import org.apache.hadoop.hive.ql.plan.PlanUtils;
 import org.apache.hadoop.hive.ql.plan.exprNodeColumnDesc;
 import org.apache.hadoop.hive.ql.plan.exprNodeDesc;
@@ -56,6 +61,7 @@
 import org.apache.hadoop.hive.ql.plan.tableDesc;
 import org.apache.hadoop.hive.ql.plan.joinDesc;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.ql.lib.Node;
 
 /**
  * Implementation of one of the rule-based map join optimization. User passes hints to specify map-joins and during this optimization,
@@ -86,7 +92,7 @@
    * @param qbJoin qb join tree
    * @param mapJoinPos position of the source to be read as part of map-reduce framework. All other sources are cached in memory
    */
-  private void convertMapJoin(ParseContext pctx, JoinOperator op, QBJoinTree joinTree, int mapJoinPos) throws SemanticException {
+  private MapJoinOperator convertMapJoin(ParseContext pctx, JoinOperator op, QBJoinTree joinTree, int mapJoinPos) throws SemanticException {
     // outer join cannot be performed on a table which is being cached
     joinDesc desc = op.getConf();
     org.apache.hadoop.hive.ql.plan.joinCond[] condns = desc.getConds();
@@ -255,6 +261,7 @@
 
     // create a dummy select to select all columns
     genSelectPlan(pctx, mapJoinOp);
+    return mapJoinOp;
   }
 
   private void genSelectPlan(ParseContext pctx, MapJoinOperator input) throws SemanticException {
@@ -340,7 +347,8 @@
    */
   public ParseContext transform(ParseContext pactx) throws SemanticException {
     this.pGraphContext = pactx;
-
+    List listMapJoinOps = new ArrayList();
+
     // traverse all the joins and convert them if necessary
     if (pGraphContext.getJoinContext() != null) {
       Map joinMap = new HashMap();
@@ -353,7 +361,7 @@
         QBJoinTree qbJoin = joinEntry.getValue();
         int mapJoinPos = mapSideJoin(joinOp, qbJoin);
         if (mapJoinPos >= 0) {
-          convertMapJoin(pactx, joinOp, qbJoin, mapJoinPos);
+          listMapJoinOps.add(convertMapJoin(pactx, joinOp, qbJoin, mapJoinPos));
        }
        else {
          joinMap.put(joinOp, qbJoin);
@@ -364,6 +372,174 @@
       pGraphContext.setJoinContext(joinMap);
     }
 
+    // Go over the list and find if a reducer is not needed
+    List listMapJoinOpsNoRed = new ArrayList();
+
+    // create a walker which walks the tree in a DFS manner while maintaining the operator stack.
+    // The dispatcher generates the plan from the operator tree
+    Map opRules = new LinkedHashMap();
+    opRules.put(new RuleRegExp(new String("R0"), "MAPJOIN%"), getCurrentMapJoin());
+    opRules.put(new RuleRegExp(new String("R1"), "MAPJOIN%.*FS%"), getMapJoinFS());
+    opRules.put(new RuleRegExp(new String("R2"), "MAPJOIN%.*RS%"), getMapJoinDefault());
+    opRules.put(new RuleRegExp(new String("R3"), "MAPJOIN%.*MAPJOIN%"), getMapJoinDefault());
+    opRules.put(new RuleRegExp(new String("R4"), "MAPJOIN%.*UNION%"), getMapJoinDefault());
+
+    // The dispatcher fires the processor corresponding to the closest matching rule and passes the context along
+    Dispatcher disp = new DefaultRuleDispatcher(getDefault(), opRules, new MapJoinWalkerCtx(listMapJoinOpsNoRed));
+
+    GraphWalker ogw = new GenMapRedWalker(disp);
+    ArrayList topNodes = new ArrayList();
+    topNodes.addAll(listMapJoinOps);
+    ogw.startWalking(topNodes, null);
+
+    pGraphContext.setListMapJoinOpsNoReducer(listMapJoinOpsNoRed);
     return pGraphContext;
   }
+
+  public static class CurrentMapJoin implements NodeProcessor {
+
+    /**
+     * Store the current mapjoin in the context
+     */
+    @Override
+    public Object process(Node nd, Stack stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+
+      MapJoinWalkerCtx ctx = (MapJoinWalkerCtx)procCtx;
+      MapJoinOperator mapJoin = (MapJoinOperator)nd;
+      ctx.setCurrMapJoinOp(mapJoin);
+      return null;
+    }
+  }
+
+  public static class MapJoinFS implements NodeProcessor {
+
+    /**
+     * Store the current mapjoin in a list of mapjoins followed by a filesink
+     */
+    @Override
+    public Object process(Node nd, Stack stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+
+      MapJoinWalkerCtx ctx = (MapJoinWalkerCtx)procCtx;
+      MapJoinOperator mapJoin = ctx.getCurrMapJoinOp();
+      List listRejectedMapJoins = ctx.getListRejectedMapJoins();
+
+      // the mapjoin has already been handled
+      if ((listRejectedMapJoins != null) &&
+          (listRejectedMapJoins.contains(mapJoin)))
+        return null;
+
+      List listMapJoinsNoRed = ctx.getListMapJoinsNoRed();
+      if (listMapJoinsNoRed == null)
+        listMapJoinsNoRed = new ArrayList();
+      listMapJoinsNoRed.add(mapJoin);
+      ctx.setListMapJoins(listMapJoinsNoRed);
+      return null;
+    }
+  }
+
+  public static class MapJoinDefault implements NodeProcessor {
+
+    /**
+     * Store the mapjoin in a rejected list
+     */
+    @Override
+    public Object process(Node nd, Stack stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      MapJoinWalkerCtx ctx = (MapJoinWalkerCtx)procCtx;
+      MapJoinOperator mapJoin = ctx.getCurrMapJoinOp();
+      List listRejectedMapJoins = ctx.getListRejectedMapJoins();
+      if (listRejectedMapJoins == null)
+        listRejectedMapJoins = new ArrayList();
+      listRejectedMapJoins.add(mapJoin);
+      ctx.setListRejectedMapJoins(listRejectedMapJoins);
+      return null;
+    }
+  }
+
+  public static class Default implements NodeProcessor {
+
+    /**
+     * nothing to do
+     */
+    @Override
+    public Object process(Node nd, Stack stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      return null;
+    }
+  }
+
+  public static NodeProcessor getMapJoinFS() {
+    return new MapJoinFS();
+  }
+
+  public static NodeProcessor getMapJoinDefault() {
+    return new MapJoinDefault();
+  }
+
+  public static NodeProcessor getDefault() {
+    return new Default();
+  }
+
+  public static NodeProcessor getCurrentMapJoin() {
+    return new CurrentMapJoin();
+  }
+
+  public static class MapJoinWalkerCtx implements NodeProcessorCtx {
+    List listMapJoinsNoRed;
+    List listRejectedMapJoins;
+    MapJoinOperator currMapJoinOp;
+
+    /**
+     * @param listMapJoins
+     */
+    public MapJoinWalkerCtx(List listMapJoinsNoRed) {
+      this.listMapJoinsNoRed = listMapJoinsNoRed;
+      this.currMapJoinOp = null;
+      this.listRejectedMapJoins = new ArrayList();
+    }
+
+    /**
+     * @return the listMapJoins
+     */
+    public List getListMapJoinsNoRed() {
+      return listMapJoinsNoRed;
+    }
+
+    /**
+     * @param listMapJoins the listMapJoins to set
+     */
+    public void setListMapJoins(List listMapJoinsNoRed) {
+      this.listMapJoinsNoRed = listMapJoinsNoRed;
+    }
+
+    /**
+     * @return the currMapJoinOp
+     */
+    public MapJoinOperator getCurrMapJoinOp() {
+      return currMapJoinOp;
+    }
+
+    /**
+     * @param currMapJoinOp the currMapJoinOp to set
+     */
+    public void setCurrMapJoinOp(MapJoinOperator currMapJoinOp) {
+      this.currMapJoinOp = currMapJoinOp;
+    }
+
+    /**
+     * @return the listRejectedMapJoins
+     */
+    public List getListRejectedMapJoins() {
+      return listRejectedMapJoins;
+    }
+
+    /**
+     * @param listRejectedMapJoins the listRejectedMapJoins to set
+     */
+    public void setListRejectedMapJoins(List listRejectedMapJoins) {
+      this.listRejectedMapJoins = listRejectedMapJoins;
+    }
+  }
 }
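The walker that MapJoinProcessor.transform now runs decides, per map join, whether a reducer is still required: rule R1 accepts a path that reaches a file sink (FS), while R2 to R4 reject paths that hit a reduce sink, another map join, or a union first, and the dispatcher fires the closest matching rule. The toy program below imitates that dispatch with plain java.util.regex instead of Hive's RuleRegExp and DefaultRuleDispatcher, purely as an illustration; the rule strings and operator paths are simplified stand-ins, not Hive's actual stack matching.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class MapJoinRuleDemo {
      public static void main(String[] args) {
        // Rough analogues of rules R1-R4 from the patch.
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("MAPJOIN.*FS", "no reducer needed");
        rules.put("MAPJOIN.*RS", "rejected: reduce sink follows");
        rules.put("MAPJOIN.*MAPJOIN", "rejected: another map join follows");
        rules.put("MAPJOIN.*UNION", "rejected: union follows");

        // Operator-name paths from the map join down to a sink.
        String[] paths = { "MAPJOIN SEL SEL FS", "MAPJOIN SEL RS GBY FS" };

        for (String path : paths) {
          String verdict = "undecided";
          int bestEnd = Integer.MAX_VALUE;
          for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getKey()).matcher(path);
            // Prefer the rule whose match ends earliest (the "closest" match),
            // so an intervening RS beats a later FS.
            if (m.find() && m.end() < bestEnd) {
              bestEnd = m.end();
              verdict = rule.getValue();
            }
          }
          System.out.println(path + " -> " + verdict);
        }
      }
    }

Run against these two paths it prints "no reducer needed" for the first and the reduce-sink rejection for the second, mirroring how accepted joins end up in listMapJoinOpsNoReducer and rejected ones in the walker context's rejected list.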
Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java Mon Jun 29 19:33:43 2009
@@ -24,6 +24,7 @@
 import java.util.Map;
 
 import org.apache.hadoop.hive.ql.exec.JoinOperator;
+import org.apache.hadoop.hive.ql.exec.MapJoinOperator;
 import org.apache.hadoop.hive.ql.exec.Operator;
 import org.apache.hadoop.hive.ql.plan.loadFileDesc;
 import org.apache.hadoop.hive.ql.plan.loadTableDesc;
@@ -57,6 +58,7 @@
   private HashMap idToTableNameMap;
   private int destTableId;
   private UnionProcContext uCtx;
+  private List listMapJoinOpsNoReducer;  // list of map join operators with no reducer
 
   /**
    * @param qb
@@ -78,6 +80,8 @@
    *          list of operators for the top query
    * @param topSelOps
    *          list of operators for the selects introduced for column pruning
+   * @param listMapJoinOpsNoReducer
+   *          list of map join operators with no reducer
    */
   public ParseContext(HiveConf conf, QB qb, ASTNode ast,
       HashMap aliasToPruner,
@@ -87,7 +91,8 @@
       HashMap, OpParseContext> opParseCtx,
       Map joinContext,
       List loadTableWork, List loadFileWork,
-      Context ctx, HashMap idToTableNameMap, int destTableId, UnionProcContext uCtx) {
+      Context ctx, HashMap idToTableNameMap, int destTableId, UnionProcContext uCtx,
+      List listMapJoinOpsNoReducer) {
     this.conf = conf;
     this.qb = qb;
     this.ast = ast;
@@ -103,6 +108,7 @@
     this.idToTableNameMap = idToTableNameMap;
     this.destTableId = destTableId;
     this.uCtx = uCtx;
+    this.listMapJoinOpsNoReducer = listMapJoinOpsNoReducer;
   }
 
   /**
@@ -311,4 +317,18 @@
     this.joinContext = joinContext;
   }
 
+  /**
+   * @return the listMapJoinOpsNoReducer
+   */
+  public List getListMapJoinOpsNoReducer() {
+    return listMapJoinOpsNoReducer;
+  }
+
+  /**
+   * @param listMapJoinOpsNoReducer the listMapJoinOpsNoReducer to set
+   */
+  public void setListMapJoinOpsNoReducer(
+      List listMapJoinOpsNoReducer) {
+    this.listMapJoinOpsNoReducer = listMapJoinOpsNoReducer;
+  }
 }

Modified: hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java (original)
+++ hadoop/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java Mon Jun 29 19:33:43 2009
@@ -44,6 +44,7 @@
 import org.apache.hadoop.hive.ql.exec.ColumnInfo;
 import org.apache.hadoop.hive.ql.exec.FunctionRegistry;
 import org.apache.hadoop.hive.ql.exec.JoinOperator;
+import org.apache.hadoop.hive.ql.exec.MapJoinOperator;
 import org.apache.hadoop.hive.ql.exec.Operator;
 import org.apache.hadoop.hive.ql.exec.OperatorFactory;
 import org.apache.hadoop.hive.ql.exec.ReduceSinkOperator;
@@ -144,6 +145,7 @@
   private ASTNode ast;
   private int destTableId;
   private UnionProcContext uCtx;
+  List listMapJoinOpsNoReducer;
 
   /**
    * ReadEntitites that are passed to the hooks.
@@ -173,6 +175,7 @@
     joinContext = new HashMap();
     this.destTableId = 1;
     this.uCtx = null;
+    this.listMapJoinOpsNoReducer = new ArrayList();
 
     inputs = new LinkedHashSet();
     outputs = new LinkedHashSet();
@@ -210,12 +213,14 @@
     destTableId = pctx.getDestTableId();
     idToTableNameMap = pctx.getIdToTableNameMap();
     this.uCtx = pctx.getUCtx();
+    this.listMapJoinOpsNoReducer = pctx.getListMapJoinOpsNoReducer();
     qb = pctx.getQB();
   }
 
   public ParseContext getParseContext() {
     return new ParseContext(conf, qb, ast, aliasToPruner, aliasToSamplePruner, topOps,
-        topSelOps, opParseCtx, joinContext, loadTableWork, loadFileWork, ctx, idToTableNameMap, destTableId, uCtx);
+        topSelOps, opParseCtx, joinContext, loadTableWork, loadFileWork, ctx, idToTableNameMap, destTableId, uCtx,
+        listMapJoinOpsNoReducer);
   }
 
   @SuppressWarnings("nls")
@@ -3863,7 +3868,8 @@
 
     ParseContext pCtx = new ParseContext(conf, qb, ast, aliasToPruner, aliasToSamplePruner, topOps,
-        topSelOps, opParseCtx, joinContext, loadTableWork, loadFileWork, ctx, idToTableNameMap, destTableId, uCtx);
+        topSelOps, opParseCtx, joinContext, loadTableWork, loadFileWork,
+        ctx, idToTableNameMap, destTableId, uCtx, listMapJoinOpsNoReducer);
 
     Optimizer optm = new Optimizer();
     optm.setPctx(pCtx);

Modified: hadoop/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java (original)
+++ hadoop/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java Mon Jun 29 19:33:43 2009
@@ -89,7 +89,7 @@
       op.setConf(filterCtx);
 
       // runtime initialization
-      op.initialize(null, null, new ObjectInspector[]{r[0].oi});
+      op.initialize(new JobConf(TestOperators.class), null, new ObjectInspector[]{r[0].oi});
 
       for(InspectableObject oner: r) {
         op.process(oner.o, oner.oi, 0);

Modified: hadoop/hive/trunk/ql/src/test/results/clientpositive/join25.q.out
URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/join25.q.out?rev=789416&r1=789415&r2=789416&view=diff
==============================================================================
--- hadoop/hive/trunk/ql/src/test/results/clientpositive/join25.q.out (original)
+++ hadoop/hive/trunk/ql/src/test/results/clientpositive/join25.q.out Mon Jun 29 19:33:43 2009
@@ -9,9 +9,8 @@ STAGE DEPENDENCIES: Stage-1 is a root stage - Stage-2 depends on stages: Stage-1 - Stage-5 depends on stages: Stage-2 - Stage-0 depends on stages: Stage-5 + Stage-4 depends on stages: Stage-1 + Stage-0 depends on stages: Stage-4 STAGE PLANS: Stage: Stage-1 @@ -28,12 +27,38 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: UDFToInteger(_col0) + type: int + expr: _col1 + type: string + expr: _col2 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat +
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Local Work: Map Reduce Local Work Alias -> Map Local Tables: @@ -52,60 +77,49 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - - Stage: Stage-2 - Map Reduce - Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1076139727/10002 - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: string - expr: _col3 - type: string - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: string - expr: _col3 - type: string - Select Operator - expressions: - expr: UDFToInteger(_col0) - type: int - expr: _col1 - type: string - expr: _col2 - type: string - File Output Operator - compressed: false - GlobalTableId: 1 - table: - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - name: dest_j1 + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: UDFToInteger(_col0) + type: int + expr: _col1 + type: string + expr: _col2 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 - Stage: Stage-5 + Stage: Stage-4 Conditional Operator list of dependent Tasks: Move Operator files: hdfs directory: true - destination: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/554961035/10000 + destination: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/69785752/10000 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1076139727/10003 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/527658600/10002 Reduce Output Operator sort order: Map-reduce partition columns: @@ -149,7 +163,7 @@ Output: default/dest_j1 query: select * from dest_j1 x order by x.key Input: default/dest_j1 -Output: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/2016361816/10000 +Output: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/998760334/10000 66 val_66 val_66 98 val_98 val_98 98 val_98 val_98 Modified: hadoop/hive/trunk/ql/src/test/results/clientpositive/join26.q.out URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/join26.q.out?rev=789416&r1=789415&r2=789416&view=diff ============================================================================== --- hadoop/hive/trunk/ql/src/test/results/clientpositive/join26.q.out (original) +++ hadoop/hive/trunk/ql/src/test/results/clientpositive/join26.q.out Mon Jun 29 19:33:43 2009 @@ -9,9 +9,8 @@ STAGE DEPENDENCIES: Stage-1 is a root stage - Stage-2 depends on stages: Stage-1 - Stage-5 depends on stages: Stage-2 - Stage-0 depends on stages: Stage-5 + Stage-4 depends on stages: Stage-1 + 
Stage-0 depends on stages: Stage-4 STAGE PLANS: Stage: Stage-1 @@ -43,16 +42,42 @@ 1 2 Position of Big Table: 2 - File Output Operator - compressed: false - GlobalTableId: 0 - directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - properties: - columns _col0,_col3,_col5 - columns.types string,string,string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col3 + type: string + expr: _col5 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + expr: _col3 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + directory: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: + name dest_j1 + columns.types string:string:string + serialization.ddl struct dest_j1 { string key, string value, string val2} + serialization.format 1 + columns key,value,val2 + bucket_count -1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + file.inputformat org.apache.hadoop.mapred.TextInputFormat + file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Local Work: Map Reduce Local Work Alias -> Map Local Tables: @@ -77,16 +102,42 @@ 1 2 Position of Big Table: 2 - File Output Operator - compressed: false - GlobalTableId: 0 - directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - properties: - columns _col0,_col3,_col5 - columns.types string,string,string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col3 + type: string + expr: _col5 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + expr: _col3 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + directory: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: + name dest_j1 + columns.types string:string:string + serialization.ddl struct dest_j1 { string key, string value, string val2} + serialization.format 1 + columns key,value,val2 + bucket_count -1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + file.inputformat org.apache.hadoop.mapred.TextInputFormat + file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 x Common Join Operator condition map: @@ -101,21 +152,47 @@ 1 2 Position of Big Table: 2 - File Output Operator - compressed: false - GlobalTableId: 0 - directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - 
table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - properties: - columns _col0,_col3,_col5 - columns.types string,string,string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col3 + type: string + expr: _col5 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + expr: _col3 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + directory: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: + name dest_j1 + columns.types string:string:string + serialization.ddl struct dest_j1 { string key, string value, string val2} + serialization.format 1 + columns key,value,val2 + bucket_count -1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + file.inputformat org.apache.hadoop.mapred.TextInputFormat + file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Needs Tagging: false Path -> Alias: - file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/srcpart/ds=2008-04-08/hr=11 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/srcpart/ds=2008-04-08/hr=11 Path -> Partition: - file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/srcpart/ds=2008-04-08/hr=11 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/srcpart/ds=2008-04-08/hr=11 Partition partition values: ds 2008-04-08 @@ -134,74 +211,21 @@ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe file.inputformat org.apache.hadoop.mapred.TextInputFormat file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - location file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/srcpart + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/srcpart serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe name: srcpart - Stage: Stage-2 - Map Reduce - Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - Select Operator - expressions: - expr: _col0 - type: string - expr: _col3 - type: string - expr: _col5 - type: string - Select Operator - expressions: - expr: _col0 - type: string - expr: _col5 - type: string - expr: _col3 - type: string - File Output Operator - compressed: false - GlobalTableId: 1 - directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10003 - table: - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - properties: - name dest_j1 - columns.types string:string:string - serialization.ddl struct dest_j1 { string key, string value, string val2} - serialization.format 1 - columns key,value,val2 - bucket_count -1 - serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - file.inputformat org.apache.hadoop.mapred.TextInputFormat - file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - 
location file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/dest_j1 - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - name: dest_j1 - Needs Tagging: false - Path -> Alias: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - Path -> Partition: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10002 - Partition - - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - properties: - columns _col0,_col3,_col5 - columns.types string,string,string - - Stage: Stage-5 + Stage: Stage-4 Conditional Operator list of dependent Tasks: Move Operator files: hdfs directory: true - source: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10003 - destination: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/934580575/10000 + source: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 + destination: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/991349586/10000 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10003 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 Reduce Output Operator sort order: Map-reduce partition columns: @@ -217,9 +241,9 @@ type: string Needs Tagging: false Path -> Alias: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10003 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 Path -> Partition: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1938408572/10003 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/107323288/10002 Partition input format: org.apache.hadoop.mapred.TextInputFormat @@ -234,7 +258,7 @@ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe file.inputformat org.apache.hadoop.mapred.TextInputFormat file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - location file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/dest_j1 + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe name: dest_j1 Reduce Operator Tree: @@ -242,7 +266,7 @@ File Output Operator compressed: false GlobalTableId: 0 - directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/934580575/10000 + directory: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/991349586/10000 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -256,7 +280,7 @@ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe file.inputformat org.apache.hadoop.mapred.TextInputFormat file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - location file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/dest_j1 + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe name: dest_j1 @@ -264,7 +288,7 @@ Move Operator tables: replace: true - source: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/934580575/10000 + source: 
file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/991349586/10000 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat @@ -278,10 +302,10 @@ serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe file.inputformat org.apache.hadoop.mapred.TextInputFormat file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - location file:/Users/char/Documents/workspace/Hive-460/build/ql/test/data/warehouse/dest_j1 + location file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/test/data/warehouse/dest_j1 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe name: dest_j1 - tmp directory: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/934580575/10001 + tmp directory: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/991349586/10001 query: INSERT OVERWRITE TABLE dest_j1 @@ -294,7 +318,7 @@ Output: default/dest_j1 query: select * from dest_j1 x order by x.key Input: default/dest_j1 -Output: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1666620544/10000 +Output: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/350810175/10000 128 val_128 val_128 128 val_128 val_128 128 val_128 val_128 Modified: hadoop/hive/trunk/ql/src/test/results/clientpositive/join27.q.out URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/join27.q.out?rev=789416&r1=789415&r2=789416&view=diff ============================================================================== --- hadoop/hive/trunk/ql/src/test/results/clientpositive/join27.q.out (original) +++ hadoop/hive/trunk/ql/src/test/results/clientpositive/join27.q.out Mon Jun 29 19:33:43 2009 @@ -9,9 +9,8 @@ STAGE DEPENDENCIES: Stage-1 is a root stage - Stage-2 depends on stages: Stage-1 - Stage-5 depends on stages: Stage-2 - Stage-0 depends on stages: Stage-5 + Stage-4 depends on stages: Stage-1 + Stage-0 depends on stages: Stage-4 STAGE PLANS: Stage: Stage-1 @@ -28,12 +27,38 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: UDFToInteger(_col0) + type: int + expr: _col1 + type: string + expr: _col2 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Local Work: Map Reduce Local Work Alias -> Map Local Tables: @@ -52,60 +77,49 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - - Stage: Stage-2 - Map Reduce - Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1103571809/10002 - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: string - 
expr: _col3 - type: string - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: string - expr: _col3 - type: string - Select Operator - expressions: - expr: UDFToInteger(_col0) - type: int - expr: _col1 - type: string - expr: _col2 - type: string - File Output Operator - compressed: false - GlobalTableId: 1 - table: - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - name: dest_j1 + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: string + expr: _col3 + type: string + Select Operator + expressions: + expr: UDFToInteger(_col0) + type: int + expr: _col1 + type: string + expr: _col2 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 - Stage: Stage-5 + Stage: Stage-4 Conditional Operator list of dependent Tasks: Move Operator files: hdfs directory: true - destination: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1381867998/10000 + destination: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/1256637782/10000 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1103571809/10003 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/1722272427/10002 Reduce Output Operator sort order: Map-reduce partition columns: @@ -149,7 +163,7 @@ Output: default/dest_j1 query: select * from dest_j1 x order by x.key, x.value Input: default/dest_j1 -Output: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1375214861/10000 +Output: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/6969413/10000 NULL val_165 val_165 NULL val_165 val_165 NULL val_193 val_193 Modified: hadoop/hive/trunk/ql/src/test/results/clientpositive/join28.q.out URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/join28.q.out?rev=789416&r1=789415&r2=789416&view=diff ============================================================================== --- hadoop/hive/trunk/ql/src/test/results/clientpositive/join28.q.out (original) +++ hadoop/hive/trunk/ql/src/test/results/clientpositive/join28.q.out Mon Jun 29 19:33:43 2009 @@ -13,9 +13,8 @@ STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 - Stage-3 depends on stages: Stage-2 - Stage-6 depends on stages: Stage-3 - Stage-0 depends on stages: Stage-6 + Stage-5 depends on stages: Stage-2 + Stage-0 depends on stages: Stage-5 STAGE PLANS: Stage: Stage-1 @@ -66,7 +65,7 @@ Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/2012162302/10002 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/1238929189/10002 Select Operator expressions: expr: _col0 @@ -85,12 +84,26 @@ 0 1 Position of Big Table: 0 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + Select Operator + expressions: + 
expr: _col0 + type: string + expr: _col5 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Local Work: Map Reduce Local Work Alias -> Map Local Tables: @@ -121,48 +134,37 @@ 0 1 Position of Big Table: 0 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - - Stage: Stage-3 - Map Reduce - Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/2012162302/10003 - Select Operator - expressions: - expr: _col0 - type: string - expr: _col5 - type: string - Select Operator - expressions: - expr: _col0 - type: string - expr: _col5 - type: string - File Output Operator - compressed: false - GlobalTableId: 1 - table: - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - name: dest_j1 + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + Select Operator + expressions: + expr: _col0 + type: string + expr: _col5 + type: string + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 - Stage: Stage-6 + Stage: Stage-5 Conditional Operator list of dependent Tasks: Move Operator files: hdfs directory: true - destination: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/16485935/10000 + destination: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/2043029458/10000 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/2012162302/10004 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/1238929189/10003 Reduce Output Operator sort order: Map-reduce partition columns: @@ -208,7 +210,7 @@ Output: default/dest_j1 query: select * from dest_j1 x order by x.key Input: default/dest_j1 -Output: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/634729814/10000 +Output: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/981515436/10000 128 val_128 128 val_128 128 val_128 Modified: hadoop/hive/trunk/ql/src/test/results/clientpositive/join29.q.out URL: http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/join29.q.out?rev=789416&r1=789415&r2=789416&view=diff ============================================================================== --- hadoop/hive/trunk/ql/src/test/results/clientpositive/join29.q.out (original) +++ hadoop/hive/trunk/ql/src/test/results/clientpositive/join29.q.out Mon Jun 29 19:33:43 2009 @@ -10,15 +10,13 @@ STAGE DEPENDENCIES: Stage-1 is a root stage - Stage-2 depends on stages: Stage-1, Stage-7 - Stage-3 depends on stages: Stage-2 - Stage-6 depends on stages: Stage-3 - Stage-0 depends on stages: Stage-6 - Stage-7 is a root stage - Stage-2 depends on stages: Stage-1, Stage-7 - Stage-3 depends on stages: 
Stage-2 - Stage-6 depends on stages: Stage-3 - Stage-0 depends on stages: Stage-6 + Stage-2 depends on stages: Stage-1, Stage-6 + Stage-5 depends on stages: Stage-2 + Stage-0 depends on stages: Stage-5 + Stage-6 is a root stage + Stage-2 depends on stages: Stage-1, Stage-6 + Stage-5 depends on stages: Stage-2 + Stage-0 depends on stages: Stage-5 STAGE PLANS: Stage: Stage-1 @@ -73,7 +71,7 @@ Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1590163898/10002 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/457091933/10002 Common Join Operator condition map: Inner Join 0 to 1 @@ -84,20 +82,46 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: bigint + expr: _col3 + type: bigint + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: bigint + expr: _col3 + type: bigint + Select Operator + expressions: + expr: _col0 + type: string + expr: UDFToInteger(_col1) + type: int + expr: UDFToInteger(_col2) + type: int + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 Local Work: Map Reduce Local Work Alias -> Map Local Tables: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1590163898/10005 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/457091933/10004 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1590163898/10005 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/457091933/10004 Common Join Operator condition map: Inner Join 0 to 1 @@ -108,60 +132,49 @@ 0 1 Position of Big Table: 1 - File Output Operator - compressed: false - GlobalTableId: 0 - table: - input format: org.apache.hadoop.mapred.SequenceFileInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat - - Stage: Stage-3 - Map Reduce - Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1590163898/10003 - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: bigint - expr: _col3 - type: bigint - Select Operator - expressions: - expr: _col0 - type: string - expr: _col1 - type: bigint - expr: _col3 - type: bigint - Select Operator - expressions: - expr: _col0 - type: string - expr: UDFToInteger(_col1) - type: int - expr: UDFToInteger(_col2) - type: int - File Output Operator - compressed: false - GlobalTableId: 1 - table: - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - name: dest_j1 + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: bigint + expr: _col3 + type: bigint + Select Operator + expressions: + expr: _col0 + type: string + expr: _col1 + type: bigint + expr: _col3 + type: bigint + Select Operator + expressions: + expr: _col0 + type: string + expr: UDFToInteger(_col1) + type: int + expr: 
UDFToInteger(_col2) + type: int + File Output Operator + compressed: false + GlobalTableId: 1 + table: + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + name: dest_j1 - Stage: Stage-6 + Stage: Stage-5 Conditional Operator list of dependent Tasks: Move Operator files: hdfs directory: true - destination: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/959537275/10000 + destination: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/1060447709/10000 Map Reduce Alias -> Map Operator Tree: - file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/1590163898/10004 + file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/457091933/10003 Reduce Output Operator sort order: Map-reduce partition columns: @@ -196,7 +209,7 @@ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe name: dest_j1 - Stage: Stage-7 + Stage: Stage-6 Map Reduce Alias -> Map Operator Tree: subq1:x @@ -255,7 +268,7 @@ Output: default/dest_j1 query: select * from dest_j1 x order by x.key Input: default/dest_j1 -Output: file:/Users/char/Documents/workspace/Hive-460/build/ql/tmp/948921556/10000 +Output: file:/data/users/njain/deploy/hive1/tools/ahive1-trunk-apache-hive/build/ql/tmp/2107397936/10000 128 1 3 146 1 2 150 1 1