hive-dev mailing list archives

From "Mithun Radhakrishnan (JIRA)" <>
Subject [jira] [Commented] (HIVE-9588) Reimplement HCatClientHMSImpl.dropPartitions() with HMSC.dropPartitions()
Date Tue, 10 Feb 2015 00:35:34 GMT


Mithun Radhakrishnan commented on HIVE-9588:

A minor update:

1. Dropping 2K partitions using HCatClient.dropPartitions() used to take 204 seconds for a
managed table on my test setup (with an Oracle backend, and remote metastore). This now takes
83 seconds.
2. Dropping 5K partitions used to take about 7 minutes. It now takes about 4 minutes.
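
For a rough sense of scale, the arithmetic behind those two data points can be sketched as below. The class and method names are purely illustrative, and the 5K figures are the approximate "about 7 minutes" and "about 4 minutes" converted to seconds, so treat the results as ballpark numbers rather than measurements.

```java
public class DropTimings {

    // Ratio of the old elapsed time to the new elapsed time.
    static double speedup(double beforeSeconds, double afterSeconds) {
        return beforeSeconds / afterSeconds;
    }

    // Average wall-clock cost of dropping a single partition, in milliseconds.
    static double millisPerPartition(double seconds, int partitions) {
        return seconds * 1000.0 / partitions;
    }

    static void report(String label, int partitions, double before, double after) {
        System.out.printf("%s: %.2fx faster, %.1f ms -> %.1f ms per partition%n",
                label,
                speedup(before, after),
                millisPerPartition(before, partitions),
                millisPerPartition(after, partitions));
    }

    public static void main(String[] args) {
        report("2K partitions", 2000, 204, 83);          // figures from item 1
        report("5K partitions", 5000, 7 * 60, 4 * 60);   // approximate figures from item 2
    }
}
```

On these numbers the speedup is roughly 2.5x for 2K partitions and 1.75x for 5K partitions.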

> Reimplement HCatClientHMSImpl.dropPartitions() with HMSC.dropPartitions()
> -------------------------------------------------------------------------
>                 Key: HIVE-9588
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog, Metastore, Thrift API
>    Affects Versions: 0.14.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>         Attachments: HIVE-9588.1.patch, HIVE-9588.2.patch
> {{HCatClientHMSImpl.dropPartitions()}} currently has an embarrassingly inefficient implementation.
> The partial partition-spec is converted into a filter-string. The partitions are fetched from
> the server, and then dropped one by one.
> Here's a reimplementation that uses the {{ExprNode}}-based {{HiveMetaStoreClient.dropPartitions()}}.
> It cuts out the excessive back-and-forth between the HMS and the client. It also reduces
> the client's memory footprint, since the partitions to be dropped no longer have to be loaded first.
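
The difference between the two strategies can be illustrated with a toy model. This is a minimal sketch, not the patch and not the Hive API: {{FakeMetastore}} and its methods are hypothetical stand-ins that only count client-to-server calls, and a plain Java {{Predicate}} plays the role of the filter expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class DropPartitionsSketch {

    /** Hypothetical in-memory stand-in for a remote metastore. It counts round trips. */
    static class FakeMetastore {
        final Map<String, String> partitions = new TreeMap<>(); // partition name -> location
        int roundTrips = 0;

        // One call: every matching partition name is shipped back to the client.
        List<String> listPartitions(Predicate<String> filter) {
            roundTrips++;
            List<String> matches = new ArrayList<>();
            for (String name : partitions.keySet()) {
                if (filter.test(name)) {
                    matches.add(name);
                }
            }
            return matches;
        }

        // One call per partition dropped.
        boolean dropPartition(String name) {
            roundTrips++;
            return partitions.remove(name) != null;
        }

        // One call: the server evaluates the filter and drops the matches itself.
        int dropPartitions(Predicate<String> filter) {
            roundTrips++;
            int before = partitions.size();
            partitions.keySet().removeIf(filter);
            return before - partitions.size();
        }
    }

    // Builds a fake metastore holding n partitions named dt=0000, dt=0001, ...
    static FakeMetastore populated(int n) {
        FakeMetastore ms = new FakeMetastore();
        for (int i = 0; i < n; i++) {
            String name = String.format("dt=%04d", i);
            ms.partitions.put(name, "/warehouse/t/" + name);
        }
        return ms;
    }

    // Old shape: fetch the matching partitions, then drop them one by one.
    static int dropOneByOne(FakeMetastore ms, Predicate<String> filter) {
        for (String name : ms.listPartitions(filter)) {
            ms.dropPartition(name);
        }
        return ms.roundTrips;
    }

    // New shape: hand the filter to the server and let it do the drop.
    static int dropInBulk(FakeMetastore ms, Predicate<String> filter) {
        ms.dropPartitions(filter);
        return ms.roundTrips;
    }

    public static void main(String[] args) {
        Predicate<String> everything = name -> name.startsWith("dt=");
        System.out.println("one by one: " + dropOneByOne(populated(2000), everything) + " calls");
        System.out.println("in bulk:    " + dropInBulk(populated(2000), everything) + " calls");
    }
}
```

In this model, dropping 2,000 partitions costs 2,001 calls one by one and a single call in bulk. The toy only counts calls and ignores the work the server and its backing database still do per partition, so it says nothing about actual elapsed time.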
