hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <>
Subject [jira] [Resolved] (HIVE-16452) Database UUID for metastore DB
Date Fri, 12 May 2017 20:07:04 GMT


Vihang Karajgaonkar resolved HIVE-16452.
          Resolution: Fixed
    Target Version/s: 3.0.0

Resolving this as both the sub-tasks are merged.

> Database UUID for metastore DB
> ------------------------------
>                 Key: HIVE-16452
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
> In cloud environments it is possible that the same database instance is used as the long-running
metadata persistence layer and that multiple HMS instances access this database. These HMS instances
could be running at the same time or, in the case of transient workloads, come up on an on-demand basis.
HMS is used by multiple projects in the Hadoop ecosystem as the de facto metadata keeper
for various SQL engines on the cluster. Currently, there is no way to uniquely identify the
database instance that is backing the HMS. For example, if two instances of HMS are
running on top of the same metastore DB, there is no way to identify that data received from both
metastore clients is coming from the same database. Similarly, if multiple HMS services come up
and go in the case of transient workloads, an external application that is fetching data
from an HMS has no way to identify that these multiple instances of HMS are in fact returning
the same data.
> We can potentially use the combination of the javax.jdo.option.ConnectionURL and javax.jdo.option.ConnectionDriverName
configuration of each HMS instance, but this approach may not be very robust. If the database
is migrated to another server for some reason, the ConnectionURL can change. Having a UUID
in the metastore DB which can be queried using a Thrift API can help solve this problem. This
way any application talking to multiple HMS instances can recognize whether the data is coming
from the same backing database.
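
As a rough illustration of the use case described above, the following Java sketch shows how an external application might group HMS endpoints by the database UUID each one reports. It assumes the UUIDs have already been fetched from each HMS through whatever Thrift call exposes them; this message does not name that call, so the sketch takes the values as plain input. The class name MetastoreDbGrouping, both method names, and the endpoint labels are hypothetical and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive: works on UUID strings that the caller
// has already obtained from each HMS instance.
class MetastoreDbGrouping {

    // Groups HMS endpoint names by the metastore DB UUID they reported.
    // Endpoints that share a key are backed by the same database.
    static Map<String, List<String>> groupByDbUuid(Map<String, String> uuidByEndpoint) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : uuidByEndpoint.entrySet()) {
            groups.computeIfAbsent(entry.getValue(), uuid -> new ArrayList<>())
                  .add(entry.getKey());
        }
        return groups;
    }

    // Returns true only if both endpoints reported a UUID and the UUIDs match.
    static boolean sameBackingDb(Map<String, String> uuidByEndpoint, String a, String b) {
        String uuidA = uuidByEndpoint.get(a);
        String uuidB = uuidByEndpoint.get(b);
        return uuidA != null && uuidA.equals(uuidB);
    }
}
```

Because the comparison relies on the stored UUID rather than on the ConnectionURL, two HMS instances would still be recognized as sharing a database after the database is migrated to another server, provided the UUID is migrated with it.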

This message was sent by Atlassian JIRA
