drill-issues mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-1383) Allow interpreted tree materialization
Date Sat, 13 Sep 2014 00:32:34 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132379#comment-14132379 ]

Jinfeng Ni edited comment on DRILL-1383 at 9/13/14 12:32 AM:
-------------------------------------------------------------

This feature would not affect Drill's external functionality. An expression evaluated
under either the compile & execute model or the interpreter model should return identical results.


1. Motivation 
    Given an expression, there are two models for evaluating it:
  1)   Compile & execute. This is the model Drill currently uses.  At execution time,
the expression is first materialized. A run-time code generator then produces an
evaluation class for the expression, and that class is compiled. Given a list of
incoming RecordBatches with an identical schema, Drill uses the compiled class to evaluate
the expression.
  
   2) Interpreter.  The interpreter model could be used at either planning time or execution time.

     At planning time, the interpreter model could be used to evaluate the constant parts of an
expression and replace each constant subexpression with its result.  Another use is partition
pruning, where the planner could use the interpreter to check whether the filter
is satisfied for a particular partition, in order to determine a subset of candidate
partitions in advance.

    At execution time, a Drill operator may switch from the compile & execute model to the
interpreter if the input is small and generating and compiling the run-time code would be
time-consuming.
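
The planning-time use for constant expressions can be pictured with a minimal sketch. All
class names here (FoldExpr, FoldConst, FoldColumn, FoldCall, ConstantFolder) are hypothetical
stand-ins and not Drill's LogicalExpression classes; the sketch only assumes that a function
call over constant operands can be evaluated directly and replaced by its result.

```java
import java.util.function.LongBinaryOperator;

// Hypothetical, simplified expression tree. These are NOT Drill's
// LogicalExpression classes; they only illustrate the folding step.
interface FoldExpr {}

final class FoldConst implements FoldExpr {
    final long value;
    FoldConst(long value) { this.value = value; }
}

final class FoldColumn implements FoldExpr {
    final String name;
    FoldColumn(String name) { this.name = name; }
}

final class FoldCall implements FoldExpr {
    final LongBinaryOperator fn;
    final FoldExpr left;
    final FoldExpr right;
    FoldCall(LongBinaryOperator fn, FoldExpr left, FoldExpr right) {
        this.fn = fn;
        this.left = left;
        this.right = right;
    }
}

final class ConstantFolder {
    // Replaces every call whose operands are both constants with its
    // interpreted result; calls that depend on a column are kept.
    static FoldExpr fold(FoldExpr e) {
        if (!(e instanceof FoldCall)) {
            return e;
        }
        FoldCall call = (FoldCall) e;
        FoldExpr l = fold(call.left);
        FoldExpr r = fold(call.right);
        if (l instanceof FoldConst && r instanceof FoldConst) {
            long v = call.fn.applyAsLong(((FoldConst) l).value, ((FoldConst) r).value);
            return new FoldConst(v);
        }
        return new FoldCall(call.fn, l, r);
    }
}
```

In this sketch, folding x + (2 * 3) keeps the outer addition, because it depends on column x,
and replaces the inner multiplication with a single constant.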

2. Interface
   The critical part of supporting the interpreter model is to statically generate an interpreter
class for each Drill function template, whether built-in or UDF, during the package build process.
 This differs from the compile & execute model, where code generation happens at
run time.

As a first step, we only consider using the interpreter model for expressions made of DrillSimpleFunc.

{code}
public interface DrillSimpleFuncInterpreter extends DrillFuncInterpreter {

  public void doSetup(ValueHolder[] args, RecordBatch incoming);

  public ValueHolder doEval(ValueHolder[] args);

}
{code}

An InterpreterBuilder class would scan all the DrillSimpleFunc function templates in the Drill
package and generate an interpreter class for each of them. Like run-time code generation,
the static generation would use JCodeModel.
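
As a rough illustration of what one generated interpreter class could look like, here is a
hand-written sketch for an integer addition function. ValueHolder, RecordBatch,
DrillFuncInterpreter and IntHolder are declared below as minimal stand-ins so the sketch
compiles on its own, and AddIntInterpreter is a hypothetical name; the actual generated code
would come from JCodeModel and the function template, and may differ.

```java
// Minimal stand-ins for Drill types so that this sketch is self-contained.
// They are assumptions, not the real Drill definitions.
interface ValueHolder {}
interface RecordBatch {}
interface DrillFuncInterpreter {}

final class IntHolder implements ValueHolder {
    int value;
    IntHolder(int value) { this.value = value; }
}

// The interface proposed in this comment.
interface DrillSimpleFuncInterpreter extends DrillFuncInterpreter {
    void doSetup(ValueHolder[] args, RecordBatch incoming);
    ValueHolder doEval(ValueHolder[] args);
}

// Hypothetical example of an interpreter class for an "add two ints" template.
final class AddIntInterpreter implements DrillSimpleFuncInterpreter {
    @Override
    public void doSetup(ValueHolder[] args, RecordBatch incoming) {
        // Addition keeps no per-batch state, so there is nothing to set up.
    }

    @Override
    public ValueHolder doEval(ValueHolder[] args) {
        IntHolder left = (IntHolder) args[0];
        IntHolder right = (IntHolder) args[1];
        return new IntHolder(left.value + right.value);
    }
}
```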

An InterpreterEvaluator class will be responsible for evaluating an expression, given an incoming
RecordBatch, and putting the result into an outgoing ValueVector:

  public static void evaluate(RecordBatch incoming, ValueVector outVV, LogicalExpression expr)

The InterpreterEvaluator will use a visitor that extends AbstractExprVisitor. For
each row, it traverses the expression tree and performs the evaluation. When a node of the
expression tree is a DrillSimpleFunc, it calls the statically generated interpreter class
for that DrillSimpleFunc and gets the result as a ValueHolder.
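
The row-by-row traversal could look roughly like the sketch below. Every type in it is a
simplified stand-in: a Map of long arrays plays the role of the incoming RecordBatch, a long
array plays the role of the outgoing ValueVector, and NodeVisitor, ExprNode, FuncInterpreter
and RowEvaluator are hypothetical names rather than Drill classes.

```java
import java.util.Map;

// Hypothetical stand-ins; Drill's real types (LogicalExpression,
// AbstractExprVisitor, RecordBatch, ValueVector) are richer than these.
interface NodeVisitor<T> {
    T visitConstant(ConstantNode node);
    T visitColumn(ColumnNode node);
    T visitFunction(FunctionNode node);
}

interface ExprNode {
    <T> T accept(NodeVisitor<T> visitor);
}

final class ConstantNode implements ExprNode {
    final long value;
    ConstantNode(long value) { this.value = value; }
    @Override public <T> T accept(NodeVisitor<T> v) { return v.visitConstant(this); }
}

final class ColumnNode implements ExprNode {
    final String name;
    ColumnNode(String name) { this.name = name; }
    @Override public <T> T accept(NodeVisitor<T> v) { return v.visitColumn(this); }
}

// Stand-in for a statically generated interpreter of one simple function.
interface FuncInterpreter {
    long doEval(long[] args);
}

final class FunctionNode implements ExprNode {
    final FuncInterpreter interpreter;
    final ExprNode[] args;
    FunctionNode(FuncInterpreter interpreter, ExprNode... args) {
        this.interpreter = interpreter;
        this.args = args;
    }
    @Override public <T> T accept(NodeVisitor<T> v) { return v.visitFunction(this); }
}

final class RowEvaluator implements NodeVisitor<Long> {
    private final Map<String, long[]> incoming;
    private int row;

    private RowEvaluator(Map<String, long[]> incoming) { this.incoming = incoming; }

    @Override public Long visitConstant(ConstantNode node) { return node.value; }

    @Override public Long visitColumn(ColumnNode node) { return incoming.get(node.name)[row]; }

    @Override public Long visitFunction(FunctionNode node) {
        // Evaluate the arguments first, then hand them to the function's interpreter.
        long[] argValues = new long[node.args.length];
        for (int i = 0; i < argValues.length; i++) {
            argValues[i] = node.args[i].accept(this);
        }
        return node.interpreter.doEval(argValues);
    }

    // Walks the expression tree once per row and writes each result into out.
    static void evaluate(Map<String, long[]> incoming, int rowCount, long[] out, ExprNode expr) {
        RowEvaluator visitor = new RowEvaluator(incoming);
        for (visitor.row = 0; visitor.row < rowCount; visitor.row++) {
            out[visitor.row] = expr.accept(visitor);
        }
    }
}
```

The sketch walks the whole tree for every row, which is the cost the interpreter model accepts
in exchange for skipping run-time code generation and compilation.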


   
 


> Allow interpreted tree materialization
> --------------------------------------
>
>                 Key: DRILL-1383
>                 URL: https://issues.apache.org/jira/browse/DRILL-1383
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Codegen
>            Reporter: Jacques Nadeau
>            Assignee: Jinfeng Ni
>
> The current code generation paradigm requires an expression tree to be compiled before
it can be evaluated.  This can be time intensive and complex when we need to evaluate an expression
only a small number of times.  We should provide a new interface that avoids code generation
for evaluation of a particular expression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
