hive-issues mailing list archives

From "Dudu Markovitz (JIRA)" <>
Subject [jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
Date Sat, 30 Jul 2016 20:23:20 GMT


Dudu Markovitz commented on HIVE-6492:

Hi guys

Perhaps I'm missing something, but although I understand the business scenario, I can't
say I understand the chosen solution.

Does it make sense to limit access to all tables by the number of partitions when the
volume of a partition can vary widely from table to table?

Does it make sense to limit all users with a single parameter where there are different groups
of users with different business justifications?

What prevents users from simply dividing their queries into multiple smaller queries?
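
To illustrate the concern, here is a minimal sketch. It assumes mytable is partitioned by create_date and that a full-year scan would exceed the limit; neither is stated in the ticket. The same data can then be read with two statements that each touch fewer partitions:

```sql
-- Assumption: mytable is partitioned by create_date (not confirmed by the ticket).
-- Each statement scans only half of the year's partitions.
select count(*) from mytable
where create_date >= date '2013-01-01' and create_date < date '2013-07-01';

select count(*) from mytable
where create_date >= date '2013-07-01' and create_date < date '2014-01-01';
```

The total work done on the cluster is roughly the same, only spread over more queries.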

Can't a user just change the parameter for his session, removing the limitation?
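
As a sketch of what such an override would look like, assuming the administrators have not added the property to hive.conf.restricted.list (mytable is a placeholder table name):

```sql
-- Per the issue description, -1 means "no limit".
-- This only takes effect if the property is not restricted for the session.
set hive.limit.query.max.table.partition=-1;

select count(*) from mytable;
```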

For various reasons, it is strongly recommended not to give users access to the tables themselves
but only to views that mask the tables.
If that approach is taken, a simple filter within the view can solve the issue, e.g.:

create view mytable_v as select * from mytable where create_date >= date '2013-01-01';

> limit partition number involved in a table scan
> -----------------------------------------------
>                 Key: HIVE-6492
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.12.0
>            Reporter: Selina Zhang
>            Assignee: Selina Zhang
>             Fix For: 0.13.0
>         Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt,
> HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt, HIVE-6492.6.patch.txt,
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> To protect the cluster, a new configuration variable "hive.limit.query.max.table.partition"
> is added to the Hive configuration to limit the number of table partitions involved in a table scan.
> The default value will be set to -1, which means there is no limit by default.
> This variable will not affect "metadata only" queries.

