hive-dev mailing list archives

From "Ashish Thusoo (JIRA)" <>
Subject [jira] Commented: (HIVE-578) Refactor partition pruning code as an optimizer transformation
Date Wed, 15 Jul 2009 21:21:14 GMT


Ashish Thusoo commented on HIVE-578:

Actually, the semantics of table sampling are a bit tricky when the sampling function is
non-deterministic (e.g. rand()). In that case there are two possibilities:

1. The sampling is done on the whole table and then the predicate is applied, which implies
no partition pruning.
2. The sampling is done after the partition predicate is applied, i.e. it is done after pruning.

Option 2 is what we do today. I am OK with keeping that, but wanted to call it out as a
special case in the partition pruning code.
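
To make the two orderings concrete, here is a minimal Python sketch. It is only a toy model and not Hive code: the table contents, the partition column `ds`, and the helper names `sample_then_filter`, `prune_then_sample`, and `keep` are all hypothetical. It shows that option 1 has to read every partition while option 2 reads only the matching one, and that the two orderings are guaranteed to agree only when the sampling function is deterministic.

```python
import random

# Hypothetical toy table: each row is (ds, value), with two partitions on ds.
ROWS = [(ds, i) for ds in ("2009-07-14", "2009-07-15") for i in range(1000)]

def sample_then_filter(rows, ds, keep):
    """Option 1: sample the whole table, then apply the partition predicate.
    keep() is evaluated on every row, so no partition can be pruned."""
    scanned = len(rows)
    return [r for r in rows if keep(r) and r[0] == ds], scanned

def prune_then_sample(rows, ds, keep):
    """Option 2: apply the partition predicate first (pruning), then sample."""
    pruned = [r for r in rows if r[0] == ds]
    return [r for r in pruned if keep(r)], len(pruned)

# Deterministic sampling function: the decision depends only on the row.
def deterministic_keep(row):
    return row[1] % 10 == 0

det_1, scanned_1 = sample_then_filter(ROWS, "2009-07-15", deterministic_keep)
det_2, scanned_2 = prune_then_sample(ROWS, "2009-07-15", deterministic_keep)

# Non-deterministic sampling function (stands in for rand()): the decision
# depends on how many rows were examined before this one, so the two
# orderings are not guaranteed to select the same rows.
def make_random_keep(seed):
    rng = random.Random(seed)
    return lambda row: rng.random() < 0.1

rnd_1, _ = sample_then_filter(ROWS, "2009-07-15", make_random_keep(0))
rnd_2, _ = prune_then_sample(ROWS, "2009-07-15", make_random_keep(0))

print("rows scanned:", scanned_1, "vs", scanned_2)
print("deterministic results equal:", det_1 == det_2)
print("random results equal:", rnd_1 == rnd_2)
```

With the deterministic function the two filters commute, so the order does not matter. With the rand()-style function each ordering consumes the random stream differently, which is why the choice between option 1 and option 2 changes the query's meaning.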

> Refactor partition pruning code as an optimizer transformation
> --------------------------------------------------------------
>                 Key: HIVE-578
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.3.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: patch-578.txt
>
> Some bugs with partition pruning have been reported, and the correct fix for many of them
> is to rewrite the partition pruning code as an optimizer transformation that kicks in
> after the predicate pushdown code. This refactor also uses the graph walker framework so
> that the partition pruning code is consolidated with the other frameworks and works on
> the operator tree rather than on the query block.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
