asterixdb-notifications mailing list archives

From "Till Westmann (JIRA)" <>
Subject [jira] [Commented] (ASTERIXDB-1246) Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is fired.
Date Sun, 03 Jan 2016 06:47:47 GMT


Till Westmann commented on ASTERIXDB-1246:

[~wangsaeu] I assume that you see this behavior after [~buyingyi]'s commit e3e13735b760491482ac7dd680dec58c5f635c16
on master.
Is that correct?

> Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is fired.
> ---------------------------------------------------------------------------------------------
>                 Key: ASTERIXDB-1246
>                 URL:
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
> Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is fired.
> Currently, group-by for a subplan is introduced when IntroduceGroupByForSubplanRule is
fired. At this time, decor variables for the new group-by operator are also added based on
the variable usage after the new group-by operator.
> After this rule, other optimizations might make decor variables unnecessary. One example
is that an assign after group-by can be moved before the group-by operator so that a record
variable (e.g., $$0) that is required for the given assign does not need to be passed through
the group-by operator. These unnecessary decor variables will be removed only when PushProjectDownRule
is fired. 
> As the rule name suggests, PushProjectDownRule will be fired only when we have a
project operator in the plan. Currently in my branch (index-only plan branch), this affects
the IntroduceSelectAccessMethodRule, which transforms a plan into an index-utilizing plan.
This rule checks whether the given plan is an index-only plan by checking the variables
used after a SELECT operator. If only the secondary key and/or primary key are used, then the
given plan is an index-only plan and we can use a secondary-index search to return SK and
> The issue is that IntroduceSelectAccessMethodRule is fired before PushProjectDownRule
and generally no project operator has been introduced in the plan before IntroduceSelectAccessMethodRule.
So, these unnecessary decor variables are not used; however, they still sit in the plan, so
the optimizer wrongly classifies the given plan as a non-index-only plan. The following
is an example query. If we have a secondary index on countA (PK: tweetid), then this should
qualify as an index-only plan for the outer branch. In fact, it does not, because unnecessary
decor variables still sit in the plan after some optimizations.
> for $t1 in dataset('TweetMessages')
> where $t1.countA > 0
> return {
> "tweetid1": $t1.tweetid,
> "count1":$t1.countA,
> "t2info": for $t2 in dataset('TweetMessages')
>                         where $t1.countA /* +indexnl */= $t2.tweetid
>                         return {"tweetid2": $t2.tweetid,
>                                 "count2": $t2.countB}
> }
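
The index-only check described above can be illustrated with a small sketch. This is not AsterixDB or Algebricks code: variables are modeled as plain strings, the function name is_index_only is hypothetical, and the secondary key is assumed to be countA with tweetid as the primary key. It only shows why one leftover decor variable is enough to flip the decision.

```python
# Hypothetical model: variables are plain strings, not optimizer objects.

def is_index_only(live_vars_after_select, secondary_key_vars, primary_key_vars):
    """Return True if every variable still live after the SELECT is an SK or PK variable."""
    allowed = set(secondary_key_vars) | set(primary_key_vars)
    return set(live_vars_after_select) <= allowed

# Assumed keys for the outer branch of the example query.
sk = {"$countA"}
pk = {"$tweetid"}

# The group-by still carries a stale decor variable ($$0, the whole record).
with_stale_decor = {"$countA", "$tweetid", "$$0"}
# The same plan once the unused decor variable has been removed.
without_stale_decor = {"$countA", "$tweetid"}

stale_result = is_index_only(with_stale_decor, sk, pk)
clean_result = is_index_only(without_stale_decor, sk, pk)
```

In this model the check fails only because $$0 is still listed, even though nothing reads it.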
> We can split PushProjectDownRule into two rules: one that pushes projects down and one
that cleans up decor variables.
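
A standalone decor-cleaning step might look roughly like the following sketch. The GroupBy class and clean_decor_variables function are hypothetical stand-ins, not the Algebricks operator or rule API; the point is only that the cleanup depends on variable usage above the group-by, not on a project operator being present.

```python
from dataclasses import dataclass

@dataclass
class GroupBy:
    """Hypothetical stand-in for a group-by operator."""
    group_vars: list
    decor_vars: list

def clean_decor_variables(group_by, vars_used_above):
    """Drop decor variables that no operator above the group-by reads.

    Returns the list of removed variables; mutates group_by in place.
    """
    used = set(vars_used_above)
    removed = [v for v in group_by.decor_vars if v not in used]
    group_by.decor_vars = [v for v in group_by.decor_vars if v in used]
    return removed

# $$0 became unnecessary after an assign was moved below the group-by.
gb = GroupBy(group_vars=["$tweetid"], decor_vars=["$$0", "$countA"])
removed = clean_decor_variables(gb, vars_used_above=["$tweetid", "$countA"])
```

Because such a step needs no project operator, it could in principle run before IntroduceSelectAccessMethodRule; where exactly it belongs in the rule ordering is left open here.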

This message was sent by Atlassian JIRA
