From: Apache Wiki
To: Apache Wiki
Reply-To: common-dev@hadoop.apache.org
Date: Mon, 13 Sep 2010 19:11:38 -0000
Message-ID: <20100913191138.62043.85998@eosnew.apache.org>
Subject: [Hadoop Wiki] Update of "Hive/FilterPushdownDev" by JohnSichi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/FilterPushdownDev" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/FilterPushdownDev?action=diff&rev1=1&rev2=2

--------------------------------------------------

  As mentioned above, we want to avoid duplication in code which
  interprets the filter string (e.g. parsing).  As a first cut, we will
- provide access to the `ExprNodeDesc` tree (either via a utility
+ provide access to the `ExprNodeDesc` tree by passing it along in
+ serialized form as an optional companion to the filter string.  In
+ followups, we will provide parsing utilities for the string form.
- which parses the filter string, or perhaps by passing it along in
- serialized form as an optional companion to the filter string).

- In followups, we can provide utilities for analyzing an expression
- tree to identify [[http://en.wikipedia.org/wiki/Sargable|sargable]]
- subexpressions.
+ We will also provide an IndexPredicateAnalyzer class capable of detecting simple [[http://en.wikipedia.org/wiki/Sargable|sargable]]
+ subexpressions in an `ExprNodeDesc` tree.  In followups, we will provide support for discriminating and combining more complex indexable subexpressions.
+
+ {{{
+ public class IndexPredicateAnalyzer
+ {
+   public IndexPredicateAnalyzer();
+
+   /**
+    * Registers a comparison operator as one which can be satisfied
+    * by an index search.  Unless this is called, analyzePredicate
+    * will never find any indexable conditions.
+    *
+    * @param udfName name of comparison operator as returned
+    * by either {@link GenericUDFBridge#getUdfName} (for simple UDFs)
+    * or udf.getClass().getName() (for generic UDFs).
+    */
+   public void addComparisonOp(String udfName);
+
+   /**
+    * Clears the set of column names allowed in comparisons.  (Initially,
+    * all column names are allowed.)
+    */
+   public void clearAllowedColumnNames();
+
+   /**
+    * Adds a column name to the set of column names allowed.
+    *
+    * @param columnName name of column to be allowed
+    */
+   public void allowColumnName(String columnName);
+
+   /**
+    * Analyzes a predicate.
+    *
+    * @param predicate predicate to be analyzed
+    *
+    * @param searchConditions receives conditions produced by analysis
+    *
+    * @return residual predicate which could not be translated to
+    * searchConditions
+    */
+   public ExprNodeDesc analyzePredicate(
+     ExprNodeDesc predicate,
+     final List<IndexSearchCondition> searchConditions);
+
+   /**
+    * Translates search conditions back to ExprNodeDesc form (as
+    * a left-deep conjunction).
+    *
+    * @param searchConditions (typically produced by analyzePredicate)
+    *
+    * @return ExprNodeDesc form of search conditions
+    */
+   public ExprNodeDesc translateSearchConditions(
+     List<IndexSearchCondition> searchConditions);
+ }
+
+ public class IndexSearchCondition
+ {
+   /**
+    * Constructs a search condition, which takes the form
+    * <pre>column-ref comparison-op constant-value</pre>.
+    *
+    * @param columnDesc column being compared
+    *
+    * @param comparisonOp comparison operator, e.g. "="
+    * (taken from GenericUDFBridge.getUdfName())
+    *
+    * @param constantDesc constant value to search for
+    *
+    * @param comparisonExpr the original comparison expression
+    */
+   public IndexSearchCondition(
+     ExprNodeColumnDesc columnDesc,
+     String comparisonOp,
+     ExprNodeConstantDesc constantDesc,
+     ExprNodeDesc comparisonExpr);
+ }
+ }}}
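For illustration, here is a hypothetical usage sketch of this API; the operator name registered below and the column name "key" are assumptions for the example, not part of the proposal:

{{{
// Hypothetical sketch: detecting sargable "key = constant" conditions
// with the proposed IndexPredicateAnalyzer.  Hive imports are elided
// because the proposal does not pin down package names.
import java.util.List;

public class IndexPredicateAnalyzerExample
{
  public static ExprNodeDesc analyze(
    ExprNodeDesc predicate,
    List<IndexSearchCondition> searchConditions)
  {
    IndexPredicateAnalyzer analyzer = new IndexPredicateAnalyzer();

    // Assume only equality comparisons can be satisfied by the index;
    // the exact operator name is an assumption here.
    analyzer.addComparisonOp(
      "org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual");

    // Restrict analysis to the (assumed) indexed column.
    analyzer.clearAllowedColumnNames();
    analyzer.allowColumnName("key");

    // searchConditions receives the indexable conditions; the return
    // value is the residual predicate Hive must still evaluate itself.
    return analyzer.analyzePredicate(predicate, searchConditions);
  }
}
}}}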

== Filter Passing ==

@@ -83, +161 @@

  * classes such as `HiveInputFormat` call `ColumnProjectionUtils` to set the projection pushdown property (READ_COLUMN_IDS_CONF_STR) on a jobConf before instantiating a `RecordReader`
  * the factory method for the `RecordReader` calls `ColumnProjectionUtils` to access this property

- There are a few differences for filter pushdown:
+ For filter pushdown:

- * the utility methods should be somewhere other than serde (maybe ql)
- * the filter needs to be available to getSplits as well since the selectivity of the filter may alter the splits needed (actually this is true in theory for column projection also)
+ * `HiveInputFormat` sets properties `hive.io.filter.text` (string form) and `hive.io.filter.expr.serialized` (serialized form of ExprNodeDesc) in the job conf before calling getSplits as well as before instantiating a record reader
+ * the storage handler's input format reads these properties and processes the filter expression
- * additional interaction for negotiation of filter decomposition (described in a later section)
+ * there is a separate optimizer interaction for negotiation of filter decomposition (described in a later section)
+
+ Note that getSplits needs to be involved since the selectivity of the filter may prune away some of the splits which would otherwise be accessed.  (In theory, column projection could also influence the split boundaries, but we'll leave that for a followup.)
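For illustration, a hypothetical sketch of a storage handler's input format consuming these properties; the deserialization helper and the split-pruning methods are assumptions, not part of the proposal:

{{{
// Hypothetical fragment of a storage handler's getSplits implementation.
public InputSplit[] getSplits(JobConf jobConf, int numSplits)
  throws IOException
{
  // String form of the filter, e.g. for logging or custom parsing.
  String filterText = jobConf.get("hive.io.filter.text");

  // Serialized ExprNodeDesc form; null when no filter was pushed down.
  String filterExprSerialized =
    jobConf.get("hive.io.filter.expr.serialized");
  if (filterExprSerialized == null) {
    // No pushdown:  fall back to scanning everything.
    return computeAllSplits(jobConf, numSplits);
  }

  // Deserialize back to an ExprNodeDesc (helper name assumed) and use
  // it to prune away splits which the filter makes irrelevant.
  ExprNodeDesc filterExpr =
    Utilities.deserializeExpression(filterExprSerialized, jobConf);
  return computeSplitsMatching(filterExpr, jobConf, numSplits);
}
}}}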

== Filter Collection ==

- So, where will `HiveInputFormat` and friends get the filter string to be
+ So, where will `HiveInputFormat` get the filter expression to be
  passed down?  Again, we can start with the pattern for column projections:

  * during optimization, `ColumnPrunerTableScanProc` in `org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory` populates the pushdown information in `TableScanOperator`

@@ -103, +183 @@

  called condn, and then sticks that on a new `FilterOperator`.  We can
  call condn.getExprString() and store the result on `TableScanOperator`.

- For getSplits, some more mucking around is going to be required (TBD).
+ Hive configuration parameter `hive.optimize.ppd.storage` can be used to enable or disable pushing filters down to the storage handler.  This will be enabled by default.  However, if `hive.optimize.ppd` is disabled, then this implicitly prevents pushdown to storage handlers as well.
+
+ We are starting with non-native tables only; we'll revisit this for pushing filters down to indexes and builtin storage formats such as RCFile.
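For example (hypothetical session commands using the parameters named above), a user could keep general predicate pushdown enabled while disabling pushdown to storage handlers:

{{{
SET hive.optimize.ppd=true;
SET hive.optimize.ppd.storage=false;
}}}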

== Filter Decomposition ==

@@ -122, +204 @@

  In order for this to be possible, the storage handler needs to be
  able to negotiate the decomposition with Hive.  This means that Hive
  gives the storage handler the entire filter, and the storage handler passes
+ back a "residual": the portion that needs to be evaluated by Hive.  A null residual indicates that the storage handler was able to deal with the entire
+ filter on its own (in which case no `FilterOperator` is needed).
- back just the portion that needs to be evaluated by Hive (or null to
- indicate that the storage handler was able to deal with the entire
- filter on its own).

+ In order to support this interaction, we will introduce a new (optional) interface to be implemented by storage handlers:
- The mechanism for this communication remains TBD.  Until it is worked
- out, we will start with a sub-optimal approach whereby the storage
- handler analyzes the filter and implements the portion it understands,
- and Hive re-evaluates the entire filter (implying redundant effort for
- the portion already taken care of by the storage handler).

+ {{{
+ public interface HiveStoragePredicateHandler {
+   public DecomposedPredicate decomposePredicate(
+     JobConf jobConf,
+     Deserializer deserializer,
+     ExprNodeDesc predicate);
+
+   public static class DecomposedPredicate {
+     public ExprNodeDesc pushedPredicate;
+     public ExprNodeDesc residualPredicate;
+   }
+ }
+ }}}
+
+ Hive's optimizer (during predicate pushdown) calls the decomposePredicate method, passing in the full expression and receiving back the decomposition (or null to indicate that no pushdown was possible).  The `pushedPredicate` gets passed back to the storage handler's input format later, and the `residualPredicate` is attached to the `FilterOperator`.
+
+ It is assumed that storage handlers which are sophisticated enough to implement this interface are suitable for tight coupling to the `ExprNodeDesc` representation.
+
+ Again, this interface is optional, and pushdown is still possible even without it.  If the storage handler does not implement this interface, Hive will always implement the entire expression in the `FilterOperator`, but it will still provide the expression to the storage handler's input format; the storage handler is free to implement as much or as little as it wants.
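To make the negotiation concrete, here is a hypothetical sketch of a storage handler implementing this interface by reusing the IndexPredicateAnalyzer described earlier; the registered operator name and the column name "key" are assumptions for the example:

{{{
// Hypothetical sketch of HiveStoragePredicateHandler.decomposePredicate.
// Hive imports are elided because the proposal does not pin down packages.
import java.util.ArrayList;
import java.util.List;

public class ExampleStorageHandler implements HiveStoragePredicateHandler
{
  public DecomposedPredicate decomposePredicate(
    JobConf jobConf,
    Deserializer deserializer,
    ExprNodeDesc predicate)
  {
    // Assume the underlying storage can only satisfy equality
    // comparisons on a single indexed column named "key".
    IndexPredicateAnalyzer analyzer = new IndexPredicateAnalyzer();
    analyzer.addComparisonOp(
      "org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual");
    analyzer.clearAllowedColumnNames();
    analyzer.allowColumnName("key");

    List<IndexSearchCondition> conditions =
      new ArrayList<IndexSearchCondition>();
    ExprNodeDesc residual = analyzer.analyzePredicate(predicate, conditions);
    if (conditions.isEmpty()) {
      // Nothing indexable:  tell Hive that no pushdown is possible.
      return null;
    }

    DecomposedPredicate decomposed = new DecomposedPredicate();
    decomposed.pushedPredicate =
      analyzer.translateSearchConditions(conditions);
    decomposed.residualPredicate = residual;
    return decomposed;
  }
}
}}}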