Return-Path: X-Original-To: apmail-hive-dev-archive@www.apache.org Delivered-To: apmail-hive-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3C82ADB9C for ; Tue, 14 May 2013 18:13:18 +0000 (UTC) Received: (qmail 96155 invoked by uid 500); 14 May 2013 18:13:17 -0000 Delivered-To: apmail-hive-dev-archive@hive.apache.org Received: (qmail 96088 invoked by uid 500); 14 May 2013 18:13:17 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 95973 invoked by uid 99); 14 May 2013 18:13:17 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 14 May 2013 18:13:17 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id B22191C9BD1; Tue, 14 May 2013 18:13:12 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============2063680498224510628==" MIME-Version: 1.0 Subject: Re: Review Request: Column Column, and Column Scalar vectorized execution tests From: "Eric Hanson" To: "Sarvesh Sakalanaga" , "Remus Rusanu" , "Jitendra Pandey" , "Eric Hanson" Cc: "tony murphy" , "hive" Date: Tue, 14 May 2013 18:13:12 -0000 Message-ID: <20130514181312.22989.15565@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org Auto-Submitted: auto-generated Sender: "Eric Hanson" X-ReviewGroup: hive X-ReviewRequest-URL: https://reviews.apache.org/r/11133/ X-Sender: "Eric Hanson" References: <20130514003408.22989.69786@reviews.apache.org> In-Reply-To: <20130514003408.22989.69786@reviews.apache.org> Reply-To: "Eric Hanson" --===============2063680498224510628== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11133/#review20538 ----------------------------------------------------------- ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tCodeGen.java missing apache license ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tCodeGen.java classes need javadoc comment at beginning ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tCodeGen.java See java coding style guide = http://www.oracle.com/technetwork/java/codeconv-138413.html = Differences for Hive: indent 2 chars not 4, and 100 char line length li= mit ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tCodeGen.java need comment to explain purpose and contents of this matrix ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnFilterVectorExpressionEvaluation.txt All in all this is a really nice approach. Nice clean test all on one p= age, fully templatized. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnFilterVectorExpressionEvaluation.txt BATCH_SIZE is optional in the constructors now and not normally recomme= nded. The default is recommended. But that should not really matter here fo= r correctness. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnFilterVectorExpressionEvaluation.txt Style issue with curly braces. Run ant checkstyle. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnFilterVectorExpressionEvaluation.txt selectedInUse can be true even of all rows have been filtered out = If all rows qualify against a filter, an optimization is to set selecte= dInUse back to false. But this is not mandatory for correctness. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnOperationVectorExpressionEvaluation.txt you have quit a bit of extra trailing white space that shows up red in = the review tool. I think you can set eclipse to get rid of it but I'm not s= ure how. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnColumnOperationVectorExpressionEvaluation.txt style guide says put blank after comma ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarFilterVectorExpressionEvaluation.txt what if input has noNulls? Then you should not look at isNull array. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarFilterVectorExpressionEvaluation.txt if the optimization is used whereby if every row qualifies, you set sel= ectedInUse to false, this could give a false failure ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarFilterVectorExpressionEvaluation.txt See earlier comment on this. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarOperationVectorExpressionEvaluation.txt test .txt templates need apache license ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarOperationVectorExpressionEvaluation.txt scalarValue set but never used ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/Tes= tColumnScalarOperationVectorExpressionEvaluation.txt not supposed to look at isNull if noNulls is set for vector ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestColum= nColumnFilterVectorExpressionEvaluation.java Why not just used VectorizedRowBatch.DEFAULT_SIZE? is there a reason? - Eric Hanson On May 14, 2013, 12:34 a.m., tony murphy wrote: > = > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/11133/ > ----------------------------------------------------------- > = > (Updated May 14, 2013, 12:34 a.m.) > = > = > Review request for hive, Jitendra Pandey, Eric Hanson, Sarvesh Sakalanaga= , and Remus Rusanu. > = > = > Description > ------- > = > This patch adds Column Column, and Column Scalar vectorized execution tes= ts. These tests are generated in parallel with the vectorized expressions. = The tests focus is on validating the column vector and the vectorized row b= atch metadata regarding nulls, repeating, and selection. > = > Overview of Changes: > = > CodeGen.java: > + joinPath, getCamelCaseType, readFile and writeFile made static for use = in TestCodeGen.java. > + filter types now specify null as their output type rather than "doesn't= matter" to make detection for test generation easier. > + support for test generation added. > = > TestCodeGen.java & Templates: = > TestClass.txt > TestColumnColumnFilterVectorExpressionEvaluation.txt, > TestColumnColumnOperationVectorExpressionEvaluation.txt, > TestColumnScalarFilterVectorExpressionEvaluation.txt, > TestColumnScalarOperationVectorExpressionEvaluation.txt > +This class is mutable and maintains a hashmap of TestSuiteClassName to t= est cases. The tests cases are added over the course of vectorized expressi= ons class generation, with test classes being outputted at the end. For eac= h column vector (inputs and/or outputs) a matrix of pairwise covering Boole= ans is used to generate test cases across nulls and repeating dimensions. B= ased on the input column vector(s) nulls and repeating states the states of= the output column vector (if there is one) is validated, along with the nu= ll vector. For filter operations the selection vector is validated against = the generated data. Each template corresponds to a class representing a tes= t suite. > = > VectorizedRowGroupUtil.java > +added methods generateLongColumnVector and generateDoubleColumnVector fo= r generating the respective column vectors with optional nulls and/or repea= ting values. > = > = > This addresses bug HIVE-4553. > https://issues.apache.org/jira/browse/HIVE-4553 > = > = > Diffs > ----- > = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /CodeGen.java 53d9a7a = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestClass.txt PRE-CREATION = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestCodeGen.java PRE-CREATION = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestColumnColumnFilterVectorExpressionEvaluation.txt PRE-CREATION = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestColumnColumnOperationVectorExpressionEvaluation.txt PRE-CREATION = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestColumnScalarFilterVectorExpressionEvaluation.txt PRE-CREATION = > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates= /TestColumnScalarOperationVectorExpressionEvaluation.txt PRE-CREATION = > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestC= olumnColumnFilterVectorExpressionEvaluation.java PRE-CREATION = > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestC= olumnColumnOperationVectorExpressionEvaluation.java PRE-CREATION = > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestC= olumnScalarFilterVectorExpressionEvaluation.java PRE-CREATION = > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestC= olumnScalarOperationVectorExpressionEvaluation.java PRE-CREATION = > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/VectorizedRowGro= upGenUtil.java 8a07567 = > = > Diff: https://reviews.apache.org/r/11133/diff/ > = > = > Testing > ------- > = > generated tests, and ran them. > = > = > Thanks, > = > tony murphy > = > --===============2063680498224510628==--