hive-issues mailing list archives

From "Naveen Gangam (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook
Date Tue, 14 May 2019 19:36:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naveen Gangam updated HIVE-21718:
---------------------------------
    Attachment: HIVE-21718.2.patch

> Improvement performance of UpdateInputAccessTimeHook
> ----------------------------------------------------
>
>                 Key: HIVE-21718
>                 URL: https://issues.apache.org/jira/browse/HIVE-21718
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 2.1.1
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Major
>         Attachments: HIVE-21718.2.patch, HIVE-21718.patch
>
>
> Currently, Hive does not update the lastAccessTime property for any entities when a query
accesses them. Thus it is not possible to know when a table was last accessed.
> Hive does provide a configurable hook for HS2 that runs as a pre-query hook, before
the query is executed. However, this hook is inefficient because, for each table or partition
whose access time it is trying to update, it executes an "alter table ... " command internally.
This is a problem for two reasons:
> 1) For a query touching thousands of partitions, this hook takes a very long time to update them.
> 2) Meanwhile, it is holding up the original query from executing.
> So even though we do not recommend using the hook, because the reward is too little (having
lastAccessTime updated), we realize there is no other means to achieve this.
> However, we can improve the performance of the hook significantly by adding a new thrift
API on HMS that updates the lastAccessTime on the database rows directly, instead of going through
the HMS front end one entity at a time (which leads to thousands of HMS calls, which in turn
lead to many thousands of calls to the database).
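
To make the cost difference described above concrete, here is a minimal sketch that counts round trips for the two strategies. It is a toy model, not Hive code: FakeMetastore, updateOne, updateBatch and the other names are hypothetical stand-ins, and the sketch does not reflect the actual thrift API added by the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Hive code): compares call counts for per-entity vs. batched updates. */
class AccessTimeUpdateSketch {

    /** Hypothetical in-memory stand-in for the metastore; not the real HMS client API. */
    static class FakeMetastore {
        final Map<String, Integer> lastAccessTime = new HashMap<>();
        int calls = 0;

        // One round trip per entity, analogous to one "alter table ..." per partition.
        void updateOne(String entity, int time) {
            calls++;
            lastAccessTime.put(entity, time);
        }

        // One round trip for the whole batch, analogous to a single bulk-update API call.
        void updateBatch(List<String> entities, int time) {
            calls++;
            for (String entity : entities) {
                lastAccessTime.put(entity, time);
            }
        }
    }

    // Updates each entity separately and returns the number of metastore calls made.
    static int perEntity(FakeMetastore ms, List<String> entities, int time) {
        for (String entity : entities) {
            ms.updateOne(entity, time);
        }
        return ms.calls;
    }

    // Updates all entities in one request and returns the number of metastore calls made.
    static int batched(FakeMetastore ms, List<String> entities, int time) {
        ms.updateBatch(entities, time);
        return ms.calls;
    }

    // Builds n hypothetical partition names for the example.
    static List<String> partitions(int n) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            names.add("db.tbl/part=" + i);
        }
        return names;
    }
}
```

In this model both strategies leave the same lastAccessTime values behind; only the number of calls differs, growing with the number of partitions in the per-entity case and staying at one in the batched case.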



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
