asterixdb-notifications mailing list archives

From "Taewoo Kim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1246) Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is fired.
Date Sun, 03 Jan 2016 07:05:39 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076776#comment-15076776
] 

Taewoo Kim commented on ASTERIXDB-1246:
---------------------------------------

Hello Till. Happy New Year. The answer is no. I haven't synced my branch with master since
December.

> Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is
fired.
> ---------------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1246
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1246
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>
> Unnecessary decor variables of a group-by are not removed until PushProjectDownRule is
fired.
> Currently, a group-by for a subplan is introduced when IntroduceGroupByForSubplanRule is
fired. At this time, decor variables for the new group-by operator are also added based on
the variable usage after the new group-by operator.
> After this rule, other optimizations might make decor variables unnecessary. One example
is that an assign after group-by can be moved before the group-by operator so that a record
variable (e.g., $$0) that is required for the given assign does not need to be passed through
the group-by operator. These unnecessary decor variables will be removed only when PushProjectDownRule
is fired. 
> As the rule name suggests, PushProjectDownRule is fired only when there is a
project operator in the plan. Currently in my branch (index-only plan branch), this affects
IntroduceSelectAccessMethodRule, which transforms a plan into an index-utilizing plan.
This rule checks whether the given plan is an index-only plan by checking the variables
used after a SELECT operator. If only the secondary key and/or primary key are used, then the
given plan is an index-only plan and we can use a secondary-index search to return SK and
PK.
> The issue is that IntroduceSelectAccessMethodRule is fired before PushProjectDownRule,
and generally no project operator has been introduced in the plan before IntroduceSelectAccessMethodRule.
So, these unnecessary decor variables are not used; however, they still sit in the plan, so
the optimizer wrongly classifies the given plan as a non-index-only plan. The following
is an example query. If we have a secondary index on count1 (PK:tweetid), then this should
qualify as an index-only plan for the outer branch. In fact, it does not, because of unnecessary
decor variables that remain in the plan after some optimizations.
> for $t1 in dataset('TweetMessages')
> where $t1.countA > 0
> return {
> "tweetid1": $t1.tweetid,
> "count1":$t1.countA,
> "t2info": for $t2 in dataset('TweetMessages')
>                         where $t1.countA /* +indexnl */= $t2.tweetid
>                         return {"tweetid2": $t2.tweetid,
>                                 "count2": $t2.countB}
> }
> We can separate PushProjectDownRule into two rules: push project down and clean
decor variables.
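
The proposed "clean decor variables" step and its effect on the index-only check can be illustrated with a minimal sketch. It is written in Python over a toy operator model, not AsterixDB's Java code base; `GroupBy`, `vars_referenced_by`, `clean_decor_variables`, and `is_index_only` are hypothetical names chosen for illustration, and the variable names `$$pk` and `$$sk` are assumed stand-ins for the primary-key and secondary-key variables.

```python
from typing import Iterable, List, Set


class GroupBy:
    """Toy stand-in for a group-by operator (hypothetical, not an Algebricks class)."""

    def __init__(self, group_vars: List[str], decor_vars: List[str]) -> None:
        self.group_vars = group_vars
        self.decor_vars = decor_vars


def vars_referenced_by(group_by: GroupBy) -> Set[str]:
    """Variables the group-by still mentions, whether or not anything needs them."""
    return set(group_by.group_vars) | set(group_by.decor_vars)


def clean_decor_variables(group_by: GroupBy, vars_used_above: Iterable[str]) -> List[str]:
    """Drop decor variables that no operator above the group-by uses.

    Returns the removed variables so the caller can see what changed.
    """
    used = set(vars_used_above)
    removed = [v for v in group_by.decor_vars if v not in used]
    group_by.decor_vars = [v for v in group_by.decor_vars if v in used]
    return removed


def is_index_only(vars_after_select: Iterable[str],
                  sk_vars: Iterable[str],
                  pk_vars: Iterable[str]) -> bool:
    """True when only secondary-key and/or primary-key variables are referenced."""
    return set(vars_after_select) <= (set(sk_vars) | set(pk_vars))


# Assumed scenario: the assign that needed the record variable $$0 has already
# been moved below the group-by, so only $$pk and $$sk are used above it.
gb = GroupBy(group_vars=["$$pk"], decor_vars=["$$sk", "$$0"])
used_above = {"$$pk", "$$sk"}

before = is_index_only(vars_referenced_by(gb), {"$$sk"}, {"$$pk"})
removed = clean_decor_variables(gb, used_above)
after = is_index_only(vars_referenced_by(gb), {"$$sk"}, {"$$pk"})

print(before, removed, after)  # False ['$$0'] True
```

In this toy model the stale decor variable `$$0` alone makes the check fail. Running the cleanup as its own step, without waiting for a project operator, lets the same check succeed.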



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
