hive-dev mailing list archives

From "David Maughan (JIRA)" <>
Subject [jira] [Created] (HIVE-12158) Add methods to HCatClient for partition synchronization
Date Tue, 13 Oct 2015 09:40:05 GMT
David Maughan created HIVE-12158:

             Summary: Add methods to HCatClient for partition synchronization
                 Key: HIVE-12158
             Project: Hive
          Issue Type: Improvement
          Components: HCatalog
            Reporter: David Maughan
            Priority: Minor

We have a use case in which a batch job running outside of Hive creates or updates a list of
partitions, and we would like to synchronize those partitions with the Hive MetaStore.
We would like to use the HCatalog {{HCatClient}}, but it does not currently appear to support
this, although it is possible with {{HiveMetaStoreClient}} directly. I propose adding the
following methods to {{HCatClient}} and {{HCatClientHMSImpl}}:

1. A method for altering partitions. The implementation would delegate to {{HiveMetaStoreClient#alter_partitions}}.
I've used "update" instead of "alter" in the name so that it is consistent with {{HCatClient#updateTableSchema}}:

public void updatePartitions(List<HCatPartition> partitions) throws HCatException
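
One detail the delegation would probably need to handle: as far as I can tell, {{HiveMetaStoreClient#alter_partitions}} is called with a single database name and table name, whereas the proposed method takes a flat list. The sketch below shows one way the list could be grouped per table before delegating. It is only an illustration under that assumption: {{PartitionRef}}, {{groupByTable}} and {{UpdatePartitionsSketch}} are hypothetical stand-ins, and no Hive classes are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of Hive or HCatalog.
class UpdatePartitionsSketch {

    /** Stand-in for HCatPartition: only the fields needed for grouping. */
    static final class PartitionRef {
        final String db;
        final String table;
        final String name;

        PartitionRef(String db, String table, String name) {
            this.db = db;
            this.table = table;
            this.name = name;
        }
    }

    /**
     * Groups partitions by "db.table", keeping first-seen order, so that one
     * alter call per table could be issued (assumption: the metastore call
     * works on one table at a time).
     */
    static Map<String, List<PartitionRef>> groupByTable(List<PartitionRef> partitions) {
        Map<String, List<PartitionRef>> groups = new LinkedHashMap<>();
        for (PartitionRef p : partitions) {
            groups.computeIfAbsent(p.db + "." + p.table, k -> new ArrayList<>()).add(p);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<PartitionRef> input = Arrays.asList(
                new PartitionRef("db1", "sales", "ds=2015-10-12"),
                new PartitionRef("db1", "clicks", "ds=2015-10-12"),
                new PartitionRef("db1", "sales", "ds=2015-10-13"));
        for (Map.Entry<String, List<PartitionRef>> e : groupByTable(input).entrySet()) {
            // In a real implementation this is where the per-table alter call would go.
            System.out.println(e.getKey() + ": " + e.getValue().size() + " partition(s)");
        }
    }
}
```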

2. A method for adding or altering partitions, depending on whether they already exist.
The implementation would split the given list into a list of existing partitions (using {{HiveMetaStoreClient#getPartitionsByNames}}
and {{Warehouse#makePartName}} to determine existence) and a list of new partitions. The
appropriate add or update call would then be issued for each list:

public void addOrUpdatePartitions(List<HCatPartition> partitions) throws HCatException
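
To make the split step concrete, here is a rough, self-contained sketch of the logic. It does not use any Hive classes: {{partName}} is a simplified stand-in for {{Warehouse#makePartName}} (it does no escaping), {{existingNames}} stands in for the names of the partitions that {{HiveMetaStoreClient#getPartitionsByNames}} would return, and {{Split}}, {{split}} and {{AddOrUpdateSketch}} are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not part of Hive or HCatalog.
class AddOrUpdateSketch {

    /** Builds a name of the form "key1=val1/key2=val2" (simplified, no escaping). */
    static String partName(List<String> keys, List<String> values) {
        if (keys.size() != values.size()) {
            throw new IllegalArgumentException("expected one value per partition key");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < keys.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            sb.append(keys.get(i)).append('=').append(values.get(i));
        }
        return sb.toString();
    }

    /** Holds the two lists produced by the split; each entry is a partition's values. */
    static final class Split {
        final List<List<String>> toAdd = new ArrayList<>();
        final List<List<String>> toUpdate = new ArrayList<>();
    }

    /** Partitions whose name is already known go to toUpdate, the rest to toAdd. */
    static Split split(List<String> keys, List<List<String>> candidates, Set<String> existingNames) {
        Split result = new Split();
        for (List<String> values : candidates) {
            if (existingNames.contains(partName(keys, values))) {
                result.toUpdate.add(values);
            } else {
                result.toAdd.add(values);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("ds", "hr");
        List<List<String>> candidates = Arrays.asList(
                Arrays.asList("2015-10-13", "08"),
                Arrays.asList("2015-10-13", "09"));
        // Pretend the metastore already knows about the 08 partition.
        Set<String> existingNames = new HashSet<>(Arrays.asList("ds=2015-10-13/hr=08"));
        Split s = split(keys, candidates, existingNames);
        System.out.println("to update: " + s.toUpdate);
        System.out.println("to add: " + s.toAdd);
    }
}
```

The existence check and the subsequent add are not atomic in this sketch, so a partition created concurrently by another writer could still make the add call fail.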

Are these acceptable? Are there any standards that should be followed here?

