hive-dev mailing list archives

From "Damien Carol (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-7324) CBO: provide a mechanism to test CBO features based on table stats only (w/o table data)
Date Tue, 01 Jul 2014 12:55:24 GMT

     [ https://issues.apache.org/jira/browse/HIVE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-7324:
-------------------------------

    Description: 
Since a lot of the CBO work is focused on planning, it would be nice to be able to run EXPLAIN
on a query to test CBO features. TPC-DS has a rich enough schema and query set, so the patch loads
a dump of TPC-DS (scale 10000) stats.

1. TestCBO shows a way to load stats from a dump and run EXPLAIN on a TPC-DS query. The output
is currently dumped to System.out. This can be improved by hooking into QTestUtil, but hopefully
this is a good start.

2. Uncovered a couple of issues in the process of testing this:
a) PartitionPruner fails on 'true' constants. For example, you will get an error for
{code:sql}
SELECT *
FROM t
WHERE partCol < 100 AND true
{code}
This surfaces because the predicates coming out of Optiq can contain 'true' constants.
b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files equals numBuckets.
This fails because there are no data files. So I have altered it to catch exceptions and assume
bucketMapJoinConvertible = false if an exception is encountered here.
I am uploading these changes as part of this patch for now and will carve them out into separate patches.
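
For readers following along, the two workarounds described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in HIVE-7324.1.patch: the class `CboStatsOnlySketch`, the `FileLister` interface and both method names are hypothetical, predicates are modelled as plain strings rather than Hive expression nodes, and dropping 'true' conjuncts is just one possible way to tolerate such predicates (the description does not say how the patch handles case a).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Hive or from the patch.
final class CboStatsOnlySketch {

    /** Hypothetical stand-in for whatever lists the data files of a table. */
    interface FileLister {
        List<String> listDataFiles(String tableName) throws IOException;
    }

    /**
     * Case a): remove literal 'true' conjuncts from an AND-ed predicate list.
     * If every conjunct was 'true', a single "true" is kept so the result
     * is never empty.
     */
    static List<String> dropTrueConjuncts(List<String> conjuncts) {
        List<String> kept = new ArrayList<>();
        for (String conjunct : conjuncts) {
            if (!"true".equalsIgnoreCase(conjunct.trim())) {
                kept.add(conjunct);
            }
        }
        if (kept.isEmpty()) {
            kept.add("true");
        }
        return kept;
    }

    /**
     * Case b): compare the number of data files with numBuckets, and treat
     * any failure (for example, no data files at all) as "not convertible".
     */
    static boolean isBucketMapJoinConvertible(FileLister lister, String table, int numBuckets) {
        try {
            return lister.listDataFiles(table).size() == numBuckets;
        } catch (IOException | RuntimeException e) {
            // Stats-only setup: fall back to the conservative answer.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(dropTrueConjuncts(Arrays.asList("partCol < 100", "true")));

        FileLister noFiles = table -> {
            throw new IOException("no data files for " + table);
        };
        System.out.println(isBucketMapJoinConvertible(noFiles, "t", 4));
    }
}
```

Returning false on failure is the conservative choice here: the planner simply skips the bucket map join conversion for that table instead of aborting the EXPLAIN run.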

[~ashutoshc], [~hagleitn], can you please take a look?



  was:
Since a lot of the CBO work is focused on planning, it would be nice to be able to run EXPLAIN
on a query to test CBO features. TPC-DS has a rich enough schema and query set, so the patch loads
a dump of TPC-DS (scale 10000) stats.

1. TestCBO shows a way to load stats from a dump and run EXPLAIN on a TPC-DS query. The output
is currently dumped to System.out. This can be improved by hooking into QTestUtil, but hopefully
this is a good start.

2. Uncovered a couple of issues in the process of testing this:
a) PartitionPruner fails on 'true' constants. For example, you will get an error for
{code}
select * from t where partCol < 100 and true
{code}
This surfaces because the predicates coming out of Optiq can contain 'true' constants.
b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files equals numBuckets.
This fails because there are no data files. So I have altered it to catch exceptions and assume
bucketMapJoinConvertible = false if an exception is encountered here.
I am uploading these changes as part of this patch for now and will carve them out into separate patches.

[~ashutoshc], [~hagleitn], can you please take a look?




> CBO: provide a mechanism to test CBO features based on table stats only (w/o table data)
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-7324
>                 URL: https://issues.apache.org/jira/browse/HIVE-7324
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-7324.1.patch
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
