drill-dev mailing list archives

From weijie tong <tongweijie...@gmail.com>
Subject Re: Is it possible to delegate data joins and filtering to the datasource ?
Date Fri, 31 Mar 2017 11:59:49 GMT
Your code looks right; what remains is to implement the
`call.transformTo()` call. I may not be able to describe the remaining
details very precisely; as @Paul Rogers mentioned, the plugin details are
somewhat tedious.

1. Call drillScanRel.getGroupScan() on each of the joined scan nodes.
2. You need to extend AbstractGroupScan and let the subclass hold the
information about your storage. Call this GroupScan AGroupScan; it
corresponds to one joined scan RelNode. Then define another GroupScan,
BGroupScan, which extends AGroupScan. BGroupScan acts as an aggregate
container that holds the two joined AGroupScans.
3. The new DrillScanRel has the same RowType as the JoinRel. The
requirements for, and examples of, transforming between two different
RelNodes can be found elsewhere in the code base. This DrillScanRel's
GroupScan is the BGroupScan, and this new DrillScanRel is the one to pass
to `call.transformTo(xxxx)`.

Maybe the picture below will help you understand my idea.

Suppose the initial RelNode tree is:

                       +---Scan (AGroupScan)
    Project ----Join --|
                       +---Scan (AGroupScan)

After this rule is applied, the final tree is:

    Project ----Scan ( BGroupScan ( List(AGroupScan, AGroupScan) ) )
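
The rewrite pictured above can be sketched in plain Java. This is only a
rough illustration under stated assumptions: Node, Scan, Join, Project,
AGroupScan and BGroupScan below are hypothetical stand-in classes written
for this sketch, not Drill or Calcite types, and rewrite() stands in for
what the rule's onMatch would hand to `call.transformTo(...)`. The row type
is modelled as a simple list of column names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in sketch only; none of these types are Drill or Calcite classes. */
public class JoinToScanSketch {

    /** Hypothetical stand-in for a plan node (a RelNode in the thread). */
    interface Node {
        List<String> rowType();
        String render();
    }

    /** Hypothetical per-table group scan holding the storage information. */
    static class AGroupScan {
        final String table;
        final List<String> columns;

        AGroupScan(String table, List<String> columns) {
            this.table = table;
            this.columns = columns;
        }

        String describe() {
            return "AGroupScan(" + table + ")";
        }
    }

    /** Aggregate container: extends AGroupScan and holds the joined scans. */
    static class BGroupScan extends AGroupScan {
        final List<AGroupScan> joined;

        BGroupScan(List<AGroupScan> joined) {
            // No single table; the columns are those of all joined scans, in order.
            super(null, allColumns(joined));
            this.joined = joined;
        }

        private static List<String> allColumns(List<AGroupScan> scans) {
            List<String> out = new ArrayList<>();
            for (AGroupScan s : scans) {
                out.addAll(s.columns);
            }
            return out;
        }

        @Override
        String describe() {
            List<String> parts = new ArrayList<>();
            for (AGroupScan s : joined) {
                parts.add(s.describe());
            }
            return "BGroupScan(" + String.join(", ", parts) + ")";
        }
    }

    static class Scan implements Node {
        final AGroupScan groupScan;
        Scan(AGroupScan groupScan) { this.groupScan = groupScan; }
        public List<String> rowType() { return groupScan.columns; }
        public String render() { return "Scan(" + groupScan.describe() + ")"; }
    }

    static class Join implements Node {
        final Node left;
        final Node right;
        Join(Node left, Node right) { this.left = left; this.right = right; }
        public List<String> rowType() {
            // Left columns followed by right columns.
            List<String> out = new ArrayList<>(left.rowType());
            out.addAll(right.rowType());
            return out;
        }
        public String render() { return "Join(" + left.render() + ", " + right.render() + ")"; }
    }

    static class Project implements Node {
        final Node input;
        Project(Node input) { this.input = input; }
        public List<String> rowType() { return input.rowType(); }
        public String render() { return "Project -> " + input.render(); }
    }

    /** Replaces every Join whose two inputs are Scans with one Scan over a BGroupScan. */
    static Node rewrite(Node node) {
        if (node instanceof Project) {
            return new Project(rewrite(((Project) node).input));
        }
        if (node instanceof Join) {
            Join join = (Join) node;
            Node left = rewrite(join.left);
            Node right = rewrite(join.right);
            if (left instanceof Scan && right instanceof Scan) {
                AGroupScan l = ((Scan) left).groupScan;
                AGroupScan r = ((Scan) right).groupScan;
                return new Scan(new BGroupScan(Arrays.asList(l, r)));
            }
            return new Join(left, right);
        }
        return node;
    }

    /** Made-up example plan: Project over a join of two scans. */
    static Node samplePlan() {
        return new Project(new Join(
                new Scan(new AGroupScan("orders", Arrays.asList("order_id", "customer_id"))),
                new Scan(new AGroupScan("customers", Arrays.asList("id", "name")))));
    }

    public static void main(String[] args) {
        Node before = samplePlan();
        Node after = rewrite(before);
        System.out.println(before.render());
        System.out.println(after.render());
        System.out.println(after.rowType());
    }
}
```

The point of the sketch is that the replacement Scan reports the same row
type as the Join it replaces, which mirrors item 3 above. In a real plugin
the BGroupScan would also need whatever AbstractGroupScan requires, which
this sketch does not attempt to cover.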







On Thu, Mar 30, 2017 at 10:01 PM, Muhammad Gelbana <m.gelbana@gmail.com>
wrote:

> *This is my rule class*
>
> public class CartesianProductJoinRule extends RelOptRule {
>
>     public static final CartesianProductJoinRule INSTANCE =
>             new CartesianProductJoinRule(DrillJoinRel.class);
>
>     public CartesianProductJoinRule(Class<DrillJoinRel> clazz) {
>         super(operand(clazz, operand(RelNode.class, any()), operand(RelNode.class, any())),
>                 "CartesianProductJoin");
>     }
>
>     @Override
>     public boolean matches(RelOptRuleCall call) {
>         DrillJoinRel drillJoin = call.rel(0);
>         return drillJoin.getJoinType() == JoinRelType.INNER
>                 && drillJoin.getCondition().isAlwaysTrue();
>     }
>
>     @Override
>     public void onMatch(RelOptRuleCall call) {
>         DrillJoinRel join = call.rel(0);
>         RelNode firstRel = call.rel(1);
>         RelNode secondRel = call.rel(2);
>         HepRelVertex right = (HepRelVertex) join.getRight();
>         HepRelVertex left = (HepRelVertex) join.getLeft();
>
>         List<RelDataTypeField> firstFields = firstRel.getRowType().getFieldList();
>         List<RelDataTypeField> secondFields = secondRel.getRowType().getFieldList();
>
>         RelNode firstTable = ((HepRelVertex) firstRel.getInput(0)).getCurrentRel();
>         RelNode secondTable = ((HepRelVertex) secondRel.getInput(0)).getCurrentRel();
>
>         //call.transformTo(???);
>     }
> }
>
> *To register the rule*, I overrode the *getOptimizerRules* method in my
> storage plugin class
>
> public Set<? extends RelOptRule> getOptimizerRules(
>         OptimizerRulesContext optimizerContext, PlannerPhase phase) {
>     switch (phase) {
>     case LOGICAL_PRUNE_AND_JOIN:
>     case LOGICAL_PRUNE:
>     case LOGICAL:
>         return getLogicalOptimizerRules(optimizerContext);
>     case PHYSICAL:
>         return getPhysicalOptimizerRules(optimizerContext);
>     case PARTITION_PRUNING:
>     case JOIN_PLANNING:
> *        return ImmutableSet.of(CartesianProductJoinRule.INSTANCE);*
>     default:
>         return ImmutableSet.of();
>     }
>
> }
>
> The rule is firing as expected but I'm lost when it comes to the
> conversion. Earlier, you said "the new equivalent ScanRel is to have the
> joined ScanRel nodes's GroupScans", so
>
>    1. How can I obtain the left and right tables' group scans?
>    2. What exactly do you mean by joining them? Is there a utility method
>    to do so? Or should I manually create a new single group scan and add
>    the information I need there? Looking into other *GroupScan*
>    implementations, I found that they have references to some runtime
>    objects such as the storage plugin and the storage plugin
>    configuration. At this stage, I don't know how to obtain those!
>    3. Precisely, what kind of object should I use to represent a *RelNode*
>    that represents the whole join? I understand that I need to use an
>    object that implements the *RelNode* interface. Then I should add the
>    created *GroupScan* to that *RelNode* instance and call
>    *call.transformTo(newRelNode)*, correct?
>
>
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana
>
> On Thu, Mar 30, 2017 at 2:46 AM, weijie tong <tongweijie178@gmail.com>
> wrote:
>
> > I mean the rule you write could be placed in
> > PlannerPhase.JOIN_PLANNING, which uses the HepPlanner. This phase works
> > on the logical RelNodes.
> > Hope this helps.
> > On Thu, Mar 30, 2017 at 12:07 AM, Muhammad Gelbana <m.gelbana@gmail.com>
> > wrote:
> >
> > > Thanks a lot Weijie, I believe I'm very close now. I hope you don't
> > > mind a few more questions please:
> > >
> > >
> > >    1. Is the new rule you are mentioning a physical rule? So should I
> > >    implement the Prel interface?
> > >    2. By "traversing the join to find the ScanRel"
> > >       - This sounds like I have to "search" for something. Shouldn't I
> > >       just work on transforming the left (i.e. DrillJoinRel's getLeft()
> > >       method) and right (i.e. DrillJoinRel's getRight() method) join
> > >       objects?
> > >       - The "left" and "right" elements of the DrillJoinRel object are
> > >       of type RelSubset, not *ScanRel*, and I can't find a type called
> > >       *ScanRel*. I suppose you meant *ScanPrel*, especially because it
> > >       implements the *Prel* interface that provides the
> > >       *getPhysicalOperator* method.
> > >    3. What if multiple physical or logical rules match a single node?
> > >    What decides which rule will be applied and which will be rejected?
> > >    Is it the *AbstractRelNode.computeSelfCost(RelOptPlanner)* method?
> > >    What if more than one rule produces the same cost?
> > >
> > > I'll go ahead and see what I can do for now, and hopefully you can
> > > offer more guidance. THANKS A LOT.
> > >
> > > *---------------------*
> > > *Muhammad Gelbana*
> > > http://www.linkedin.com/in/mgelbana
> > >
> > > On Wed, Mar 29, 2017 at 4:23 AM, weijie tong <tongweijie178@gmail.com>
> > > wrote:
> > >
> > > > > To avoid misunderstanding: the new equivalent ScanRel should hold
> > > > > the GroupScans of the joined ScanRel nodes, since the GroupScans
> > > > > indirectly hold the underlying storage information.
> > > >
> > > > > On Wed, Mar 29, 2017 at 10:15 AM, weijie tong
> > > > > <tongweijie178@gmail.com> wrote:
> > > >
> > > > >
> > > > > > My suggestion is to define a rule that matches the DrillJoinRel
> > > > > > RelNode. In its onMatch method, traverse the join's children to
> > > > > > find the ScanRel nodes. Define a new ScanRel that includes the
> > > > > > ScanRel nodes found in the previous step, then transform the
> > > > > > JoinRel into this equivalent new ScanRel.
> > > > > > In the end, the plan tree will contain the ScanRel instead of
> > > > > > the JoinRel. You can place your join planning rule in
> > > > > > PlannerPhase.JOIN_PLANNING.
> > > > >
> > > >
> > >
> >
>
