hive-dev mailing list archives

From Carl Steinbach
Subject Re: multi-cluster support in hive
Date Wed, 18 Jan 2012 01:31:13 GMT
Discussion on this topic has moved over to the JIRA ticket:

On Tue, Jan 17, 2012 at 4:01 PM, yongqiang he wrote:

> Hi hive-dev,
> We are planning to make hive run across multiple data centers
> (physical clusters). We prefer to use hive metastore to provide a
> unified namespace.
> Tables/partitions can exist in more than one cluster, and one cluster
> is defined as the primary cluster. The primary cluster is a table-level
> property. If table T1's primary cluster is C1, that means: 1) C1
> contains all data that is available in all other clusters; 2) writes
> to T1 are only allowed in C1, though we need to allow exceptions here;
> 3) new partitions may only be created in C1; 4) all data changes to T1
> made in the primary cluster should be replicated to the secondary
> clusters, if there are any, but there should be a conf to disable
> this, as there are some exceptional situations.
> The first thing that needs to be done is to give the hive metastore a
> concept of cluster. That also means all thrift calls to the metastore
> need to provide a cluster parameter. So we have three options here:
> 1) add a cluster parameter to existing thrift interfaces
> or
> 2) add new interfaces that provide exactly the same functionality as
> the old ones but under a different name (an _on_cluster suffix,
> maybe?) and with a cluster parameter
> or
> 3) overload the database name to serve as the cluster name, and allow
> a table to co-exist in multiple databases. But that requires promoting
> the table to a top-level citizen and demoting the database. For
> example, "show tables" used to scan all tables in the current db, but
> would now need to scan all tables in all databases.
> We would like to get more ideas about which one to choose, and we are
> definitely open to other alternatives that we missed here.
> We are also looking for other systems that have solved similar
> problems. If anyone knows such a system, we would like to know.
> Appreciate that!
> This is tracked on jira
> Thanks
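
For readers of the archive, the primary-cluster rules and option 2 from the quoted proposal can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hive code: ToyMetastore, Table, the replicate flag and the method names are all hypothetical, and "replication" here only records which clusters hold a partition rather than copying any data.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    primary_cluster: str
    # Clusters holding a copy of the table (the primary is added below).
    clusters: set = field(default_factory=set)
    # Partition name -> set of clusters that hold that partition.
    partitions: dict = field(default_factory=dict)
    # Hypothetical stand-in for the "conf to disable" replication (rule 4).
    replicate: bool = True

    def __post_init__(self):
        self.clusters.add(self.primary_cluster)

    def secondaries(self):
        return self.clusters - {self.primary_cluster}


class ToyMetastore:
    """In-memory stand-in for a metastore that knows about clusters."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, primary_cluster, secondaries=()):
        table = Table(name, primary_cluster, set(secondaries))
        self.tables[name] = table
        return table

    def add_partition_on_cluster(self, table_name, partition, cluster):
        # Option 2 style: a new call with an _on_cluster suffix that
        # carries the cluster explicitly.
        table = self.tables[table_name]
        if cluster != table.primary_cluster:
            # Rule 3: new partitions may only be created in the primary.
            raise PermissionError(
                f"{table_name}: partitions must be created in "
                f"{table.primary_cluster}, not {cluster}"
            )
        holders = table.partitions.setdefault(partition, set())
        holders.add(cluster)
        if table.replicate:
            # Rule 4: changes in the primary propagate to secondaries.
            holders |= table.secondaries()
        return holders

    def add_partition(self, table_name, partition):
        # The old call keeps its signature and targets the primary cluster.
        primary = self.tables[table_name].primary_cluster
        return self.add_partition_on_cluster(table_name, partition, primary)
```

With option 2, existing clients keep calling add_partition unchanged, while cluster-aware clients use the suffixed call. Option 1 would instead add the cluster argument to add_partition itself, which changes the signature every existing client sees.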
