hive-dev mailing list archives

From "Selina Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
Date Thu, 27 Feb 2014 00:23:19 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913795#comment-13913795 ]

Selina Zhang commented on HIVE-6492:
------------------------------------

It is not rare for a table to have 1000+ partitions. To keep people from issuing a query without
knowing how many partitions it will scan, one more configuration variable,
"hive.limit.query.max.table.partition", is introduced so that system admins can protect the grid.

The default value is set to -1 which means no limit. 

This variable will be ignored in the following cases:
1. Simple fetch query with limit : 
    select * from table limit n;
2. Metadata only query: 
    select distinct partition_key from partition_table;
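
As a rough sketch of how this might look in a session (the table page_views, its
partition key dt, and the limit of 1000 are made up for illustration; what exactly
happens when the limit is exceeded is not covered here):

    -- cap the number of partitions a single table scan may involve
    set hive.limit.query.max.table.partition=1000;

    -- ignored: simple fetch query with limit
    select * from page_views limit 10;

    -- ignored: metadata only query
    select distinct dt from page_views;

    -- subject to the limit: a regular scan over the matching partitions
    select count(*) from page_views where dt >= '2014-01-01';

Setting the variable back to -1 removes the limit again.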

There is one special case: sometimes BI tools such as Tableau (connected through the ODBC driver)
will issue
   select * from table
at the initial stage to figure out table metadata. This will not hurt the grid because Tableau
will cancel the query after it receives one or two rows. So that Tableau can still work,
code is added to mark the query client type, such as CLIDriver or JDBC, and only ODBC-sourced
queries are allowed to go through.




> limit partition number involved in a table scan
> -----------------------------------------------
>
>                 Key: HIVE-6492
>                 URL: https://issues.apache.org/jira/browse/HIVE-6492
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.12.0
>            Reporter: Selina Zhang
>             Fix For: 0.13.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> To protect the cluster, a new configure variable "hive.limit.query.max.table.partition"
> is added to hive configuration to
> limit the table partitions involved in a table scan. 
> The default value will be set to -1 which means there is no limit by default. 
> This variable will not affect "metadata only" query.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
