hive-dev mailing list archives

From "Siying Dong (JIRA)" <>
Subject [jira] Updated: (HIVE-1750) Remove Partition Filtering Conditions when Possible
Date Wed, 03 Nov 2010 01:34:26 GMT


Siying Dong updated HIVE-1750:

    Attachment: HIVE-1750.3.patch

1. Fix the opOr -> opAnd bug.
2. Remove walkEvalExpr, as it is not really needed.
3. Instead of removing an operator right after walking its expression tree, save the operators
and remove them one by one after walking the operator tree. This handles cases like:
    from src_tbl
    insert overwrite table tbl1 select * where ds='1'
    insert overwrite table tbl2 select * where ds='1';
Removing operators while walking the tree is dangerous in these cases.
4. Fix comments and add some. Fix a variable name.
5. Rename the classes for the partition condition remover so they differ from those of ppr.
6. add some queries in the test: 1. cover opAnd case; 2. cover multiple insert from the same
7. Always put pruned partitions into the hash map. (We cannot remove the calls to the partition
pruner in multiple places, because PartitionConditionRemover is not guaranteed to prune
partitions, for example when the table is not partitioned.)
8. PartitionPruner and PartitionConditionRemover now share some code for evaluating expressions
with partition columns.
9. Some other code cleanup.
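
To illustrate item 3, here is a minimal sketch of the "collect during the walk, remove afterwards" idea. It is written in Python for brevity and is not the code in the patch: the `Op` class, `walk`, `splice_out`, and `remove_after_walk` are hypothetical names, and the operator labels are made up to mirror the multi-insert query above.

```python
class Op:
    """Hypothetical operator node; not Hive's Operator class."""

    def __init__(self, name, removable=False):
        self.name = name
        self.removable = removable  # stands in for "filter is always true"
        self.parents = []
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child.parents.append(self)
        return child


def walk(op, to_remove, seen):
    """Depth-first walk that only records removable operators."""
    if id(op) in seen:
        return
    seen.add(id(op))
    if op.removable:
        to_remove.append(op)
    for child in op.children:
        walk(child, to_remove, seen)


def splice_out(op):
    """Connect each parent of op directly to op's children."""
    for parent in op.parents:
        idx = parent.children.index(op)
        parent.children[idx:idx + 1] = op.children
    for child in op.children:
        idx = child.parents.index(op)
        child.parents[idx:idx + 1] = op.parents
    op.parents, op.children = [], []


def remove_after_walk(root):
    """Walk the whole tree first, then remove the saved operators one by one."""
    to_remove = []
    walk(root, to_remove, set())
    for op in to_remove:
        splice_out(op)
    return [op.name for op in to_remove]


# One table scan feeding two filters, as in the multi-insert query above.
scan = Op("TS[src_tbl]")
scan.add_child(Op("FIL[ds='1'] #1", removable=True)).add_child(Op("FS[tbl1]"))
scan.add_child(Op("FIL[ds='1'] #2", removable=True)).add_child(Op("FS[tbl2]"))
removed = remove_after_walk(scan)
```

Because the tree is not modified until the walk has finished, the walker never iterates over a child list that is being edited underneath it.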

Ran the test pcr.q; the whole test suite is still running. Will "submit patch" when the other
tests pass.

> Remove Partition Filtering Conditions when Possible
> ---------------------------------------------------
>                 Key: HIVE-1750
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-1750.1.patch, HIVE-1750.2.patch, HIVE-1750.3.patch
> For some simple queries, partition filtering constraints take 8% of CPU time (now 16%
> since we filter twice) even if the result is always true. When possible, we should remove
> these constraints to save CPU time.
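
As a rough illustration of the idea in the description, the sketch below drops a filter condition when it is true for every partition that survives pruning. It is an assumption-laden toy in Python, not the algorithm in the patch: the tuple-based expression format and the `simplify` function are invented for this example, and only equality, AND, and OR are handled.

```python
def simplify(expr, partitions):
    """Return True if expr holds for every partition, else a reduced expr.

    expr is one of (hypothetical format):
      ("eq", column, value)   comparison on a partition column
      ("and", left, right)
      ("or", left, right)
      ("other", text)         predicate on non-partition columns, kept as-is
    partitions is a list of dicts mapping partition column -> value.
    """
    kind = expr[0]
    if kind == "eq":
        _, col, val = expr
        # Removable only if every remaining partition satisfies it.
        return True if all(p.get(col) == val for p in partitions) else expr
    if kind == "and":
        left = simplify(expr[1], partitions)
        right = simplify(expr[2], partitions)
        if left is True:
            return right  # an always-true side of AND can be dropped
        if right is True:
            return left
        return ("and", left, right)
    if kind == "or":
        left = simplify(expr[1], partitions)
        right = simplify(expr[2], partitions)
        if left is True or right is True:
            return True  # an always-true side makes the whole OR true
        return ("or", left, right)
    return expr


one_partition = [{"ds": "1"}]
cond = ("and", ("eq", "ds", "1"), ("other", "key > 10"))
reduced = simplify(cond, one_partition)  # only the non-partition predicate remains
```

Note that AND and OR need different handling here, which is the kind of distinction the opOr -> opAnd fix in item 1 is about.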

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
