hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-13260) Bootstrap Tables for fun and profit
Date Tue, 17 Mar 2015 01:24:38 GMT
Enis Soztutar created HBASE-13260:

             Summary: Bootstrap Tables for fun and profit 
                 Key: HBASE-13260
                 URL: https://issues.apache.org/jira/browse/HBASE-13260
             Project: HBase
          Issue Type: Bug
            Reporter: Enis Soztutar
            Assignee: Enis Soztutar
             Fix For: 2.0.0, 1.1.0

Over at the ProcV2 discussions (HBASE-12439) and elsewhere, I was mentioning an idea where we
may want to use regular old regions to store/persist some data needed for HBase master to

We regularly use system tables for storing system data. Examples include acl, meta, namespace,
and quota. We also store the table state in meta now. Some data is persisted in zk only (replication
peers, replication state, etc.). We are moving away from zk as permanent storage. As any
self-respecting database does, we should store almost all of our data in HBase itself.

However, we have an "availability" dependency between different kinds of data. For example,
all system tables need meta to be assigned first. All master operations need the ns table to be
assigned, etc.

For at least two types of data, (1) procedure v2 states and (2) RS groups in HBASE-6721, we cannot
depend on meta being assigned, since "assignment" itself will depend on accessing this data.
The solution in (1) is to implement a custom WAL format, plus custom lease recovery and WAL
recovery. The solution in (2) is to have a table store this data, but also cache it in
zk for bootstrapping initial assignments.
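
The circular dependency described above can be made concrete with a small sketch. This is
not HBase code: AvailabilityGraph and the item names are hypothetical, and the block only
models "X cannot become available until Y is" as edges in a graph, then looks for a valid
bring-up order with a depth-first search.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model of "availability" dependencies between kinds of system data. */
class AvailabilityGraph {
    private final Map<String, List<String>> needs = new LinkedHashMap<>();

    /** Records that item can only become available after dependency. */
    void require(String item, String dependency) {
        needs.computeIfAbsent(item, k -> new ArrayList<>()).add(dependency);
        needs.computeIfAbsent(dependency, k -> new ArrayList<>());
    }

    /** Returns a valid bring-up order, or empty if the dependencies form a cycle. */
    Optional<List<String>> bringUpOrder() {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String item : needs.keySet()) {
            if (!visit(item, done, inProgress, order)) {
                return Optional.empty();
            }
        }
        return Optional.of(order);
    }

    private boolean visit(String item, Set<String> done,
                          Set<String> inProgress, List<String> order) {
        if (done.contains(item)) {
            return true;
        }
        if (!inProgress.add(item)) {
            return false; // item is already on the current path: cycle
        }
        for (String dep : needs.get(item)) {
            if (!visit(dep, done, inProgress, order)) {
                return false;
            }
        }
        inProgress.remove(item);
        done.add(item);
        order.add(item); // all dependencies are already in the list
        return true;
    }
}
```

With only "namespace needs meta" and "acl needs meta", an order exists and meta comes first.
Adding "assignment needs procedure-state", "procedure-state needs meta" and "meta needs
assignment" leaves no valid order, which is the situation that rules out storing procedure
state in a regularly assigned table.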

For solving both of the above (and possible future use cases, if any), I propose we add a "bootstrap
table" concept, which is: 
 - A set of predefined tables hosted in a separate dir in HDFS. 
 - A table is only 1 region, not splittable 
 - Not assigned through regular assignment 
 - Hosted only on 1 server (typically master)
 - Has a dedicated WAL. 
 - A service does WAL recovery + fencing for these tables. 
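
The properties listed above could be sketched roughly as follows. Everything here is an
assumption made for illustration: BootstrapTableSketch, its methods and the directory
layout are hypothetical, the in-memory list only stands in for a dedicated WAL, and the
token counter is a simple stand-in for fencing, not how HBase fences WAL writers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a bootstrap table; none of these names are HBase APIs. */
final class BootstrapTableSketch {
    private final String name;
    private final String dir;                            // separate dir, outside the regular table layout
    private final List<String> wal = new ArrayList<>();  // stands in for the dedicated WAL
    private String host;                                 // the single hosting server (typically master)
    private long token;                                  // illustrative fencing token

    BootstrapTableSketch(String rootDir, String name) {
        this.name = name;
        this.dir = rootDir + "/" + name;
    }

    int regionCount() { return 1; }          // exactly one region
    boolean isSplittable() { return false; } // never split
    String dir() { return dir; }
    String host() { return host; }

    /** Opens the region on server without regular assignment and fences the previous host. */
    long openOn(String server) {
        host = server;
        return ++token;
    }

    /** Appends an edit; a writer holding a stale token is rejected. */
    void append(long writerToken, String edit) {
        if (writerToken != token) {
            throw new IllegalStateException("fenced: stale host for " + name);
        }
        wal.add(edit);
    }

    /** Returns a copy of the logged edits, as a recovery service would replay them on takeover. */
    List<String> recover() {
        return new ArrayList<>(wal);
    }
}
```

In this sketch a new host calls openOn, replays recover(), and from then on any append
from the old host fails, so only one server writes to the table at a time.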

This has the benefit of using a region to keep the data, which frees us from re-implementing caching,
and lets us use the same WAL / Memstore / Recovery mechanisms that are battle-tested. 

