hive-issues mailing list archives

From "BELUGA BEHR (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-19489) Disable stats autogather for external tables
Date Fri, 27 Jul 2018 21:24:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560351#comment-16560351 ]

BELUGA BEHR edited comment on HIVE-19489 at 7/27/18 9:23 PM:
-------------------------------------------------------------

There is already such a flag and it is mentioned in [HIVE-18743].

My suggestion would be to use this flag (though I would rename it; I dislike the "do_not_" prefix).

Users could manually set it at the table properties level, but by default it would be set
to 'true' for managed tables and 'false' for external tables.
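
As a rough illustration of the default proposed above, the sketch below resolves the effective setting for a table: an explicit table property wins, otherwise the default depends on the table type. The property name `stats.autogather` and the function name are hypothetical; the thread only says the existing flag should be renamed, not what it would be called, and this is not Hive's implementation.

```python
from typing import Mapping

# Hypothetical property name: the thread asks for the existing flag to be
# renamed without its "do_not_" prefix but does not give the new name.
STATS_AUTOGATHER_PROP = "stats.autogather"


def stats_autogather_enabled(is_external: bool, table_props: Mapping[str, str]) -> bool:
    """Return the effective autogather setting for one table.

    A value set manually in the table properties takes precedence.
    Without one, managed tables default to True and external tables to False.
    """
    explicit = table_props.get(STATS_AUTOGATHER_PROP)
    if explicit is not None:
        return explicit.strip().lower() == "true"
    return not is_external
```

With this sketch, an external table with no property set resolves to `False`, and a user who trusts that only Hive writes to it can opt back in by setting the property to 'true'.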


was (Author: belugabehr):
There is already such a flag and it is mentioned in [HIVE-18743].

My suggestion would be to use this flag (though I would rename it; I dislike the "do_not_" prefix).

Users could manually set it, but by default it would be set to 'true' for managed tables and
'false' for external tables.

> Disable stats autogather for external tables
> --------------------------------------------
>
>                 Key: HIVE-19489
>                 URL: https://issues.apache.org/jira/browse/HIVE-19489
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Statistics
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>
> Hive auto-gather of table statistics can result in incorrect generation of stats (and
> the stats being marked as accurate) in the case of external tables where the data is being
> written by external apps.
> To avoid this issue, stats autogather will be disabled on external tables when loading/inserting
> into a table with existing data, if HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS is enabled.
> In this situation, users should rely on explicitly calling ANALYZE TABLE on their external
> tables to make sure the stats are kept up-to-date.
> Autogather of stats will still be allowed to occur on external tables in the case of
> INSERT OVERWRITE or LOAD DATA OVERWRITE, since the existing data is being removed and so the
> stats calculated on the inserted/loaded data should be accurate.
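
The rule in the issue description can be read as a small decision function. The sketch below is only an interpretation of that text, not code from Hive; the function and parameter names are made up for illustration, and `disable_unsafe_external_ops` stands in for HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS.

```python
def should_autogather_stats(
    is_external: bool,
    is_overwrite: bool,
    has_existing_data: bool,
    disable_unsafe_external_ops: bool,
) -> bool:
    """Decide whether a load/insert may auto-gather table statistics.

    Hypothetical helper that mirrors the issue description only.
    """
    if not is_external:
        # The restriction described in the issue applies to external tables only.
        return True
    if is_overwrite:
        # INSERT OVERWRITE / LOAD DATA OVERWRITE removes the existing data,
        # so stats computed on the new data should be accurate.
        return True
    if not has_existing_data:
        # The issue only restricts writes into a table with existing data.
        return True
    # Appending to an external table that already holds data: stats may be
    # wrong if external apps also write there, so skip when the flag is on.
    return not disable_unsafe_external_ops
```

When this returns `False`, the description's advice applies: run ANALYZE TABLE explicitly on the external table to bring its stats up to date.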



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
