hive-issues mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16452) Database UUID for metastore DB
Date Fri, 12 May 2017 22:12:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008800#comment-16008800 ]

Lefty Leverenz commented on HIVE-16452:
---------------------------------------

Ummm, I was hoping you'd figure out where to put the docs.  ;)

Perhaps we need a new wiki page for APIs.  In the meantime, the APIs Overview might be the
best place -- either the Metastore (Java) section or a new section.

Another possibility is the Metastore Administration page.  A new section could go after the
list of supported databases, or it could be a subsection:

* [Metastore Administration -- Supported Backend Databases | https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-SupportedBackendDatabasesforMetastore]

Wherever it goes, let's have a cross-reference from the other page(s).

> Database UUID for metastore DB
> ------------------------------
>
>                 Key: HIVE-16452
>                 URL: https://issues.apache.org/jira/browse/HIVE-16452
>             Project: Hive
>          Issue Type: New Feature
>          Components: Metastore
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>
> In cloud environments, the same database instance may be used as the long-running
metadata persistence layer while multiple HMS instances access it. These HMS instances
could be running at the same time or, in the case of transient workloads, come up on demand.
HMS is used by multiple projects in the Hadoop ecosystem as the de facto metadata keeper
for various SQL engines on the cluster. Currently, there is no way to uniquely identify the
database instance that is backing the HMS. For example, if two instances of HMS are
running on top of the same metastore DB, there is no way to tell that data received from both
metastore clients is coming from the same database. Similarly, if multiple HMS services
come up and go away for transient workloads, an external application that is fetching data
from an HMS has no way to tell that these multiple instances of HMS are in fact returning
the same data.
> We can potentially use the combination of the javax.jdo.option.ConnectionURL and javax.jdo.option.ConnectionDriverName
configuration of each HMS instance, but this approach may not be very robust. If the database
is migrated to another server for some reason, the ConnectionURL can change. Having a UUID
in the metastore DB that can be queried using a Thrift API can help solve this problem. This
way, any application talking to multiple HMS instances can recognize whether the data is coming
from the same backing database.
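
As a rough illustration of the use case in the issue description, an external
application could group the HMS endpoints it talks to by the UUID each one reports.
This is only a sketch under stated assumptions: fetch_db_uuid is a hypothetical
stand-in for whatever Thrift call ends up exposing the UUID, and the endpoint
addresses and UUID values are made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_backing_db(
    endpoints: Iterable[str],
    fetch_db_uuid: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group HMS endpoints by the metastore DB UUID each one reports.

    fetch_db_uuid is a hypothetical callable (not a Hive API) that takes an
    endpoint address and returns the UUID string stored in its backing DB.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for endpoint in endpoints:
        # Normalize case and whitespace so formatting differences
        # do not split endpoints that share a database.
        db_uuid = fetch_db_uuid(endpoint).strip().lower()
        groups[db_uuid].append(endpoint)
    return dict(groups)


# Made-up data standing in for real Thrift responses.
FAKE_UUIDS = {
    "thrift://hms-a:9083": "3f2b8c1e-5d4a-4c7b-9e21-0a1b2c3d4e5f",
    "thrift://hms-b:9083": "3F2B8C1E-5D4A-4C7B-9E21-0A1B2C3D4E5F",
    "thrift://hms-c:9083": "9a8b7c6d-1234-4abc-8def-fedcba987654",
}


def fake_fetch(endpoint: str) -> str:
    """Look up the made-up UUID for an endpoint (no network access)."""
    return FAKE_UUIDS[endpoint]
```

Endpoints that land in the same group are backed by the same database, even if
their ConnectionURL settings differ, for example after a migration.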



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
