Date: Wed, 29 Apr 2015 21:23:07 +0000 (UTC)
From: "Enis Soztutar (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-13260) Bootstrap Tables for fun and profit

    [ https://issues.apache.org/jira/browse/HBASE-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520276#comment-14520276 ]

Enis Soztutar commented on HBASE-13260:
---------------------------------------

It is mostly in the 1M regions jira, and some offline discussion with Stack.

> Bootstrap Tables for fun and profit
> ------------------------------------
>
>                 Key: HBASE-13260
>                 URL: https://issues.apache.org/jira/browse/HBASE-13260
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: hbase-13260_bench.patch, hbase-13260_prototype.patch
>
>
> Over at the ProcV2 discussions (HBASE-12439) and elsewhere, I mentioned an idea where we may want to use regular old regions to store/persist some data needed for the HBase master to operate.
> We regularly use system tables for storing system data. acl, meta, namespace, quota are some examples. We also store the table state in meta now. Some data is persisted in zk only (replication peers and replication state, etc). We are moving away from zk as permanent storage. As any self-respecting database does, we should store almost all of our data in HBase itself.
> However, we have an "availability" dependency between different kinds of data. For example, all system tables need meta to be assigned first. All master operations need the ns table to be assigned, etc.
> For at least two types of data, (1) procedure v2 states and (2) RS groups in HBASE-6721, we cannot depend on meta being assigned, since "assignment" itself will depend on accessing this data. The solution in (1) is to implement a custom WAL format, with custom lease recovery and WAL recovery. The solution in (2) is to have a table store this data, but also cache it in zk for bootstrapping initial assignments.
> For solving both of the above (and possible future use cases, if any), I propose we add a "bootstrap table" concept, which is:
> - A set of predefined tables hosted in a separate dir in HDFS.
> - A table is only 1 region, not splittable.
> - Not assigned through regular assignment.
> - Hosted only on 1 server (typically master).
> - Has a dedicated WAL.
> - A service does WAL recovery + fencing for these tables.
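
To make the properties listed above concrete, here is a rough sketch in Python of what a bootstrap table descriptor and the resulting startup order might look like. Everything in it is an assumption for illustration only: BootstrapTable, startup_order, the "procedures" table name, and the /hbase/bootstrap path are hypothetical and are not HBase APIs or anything taken from the attached patches. It only models the ordering argument from the description (bootstrap tables first, then meta, then the other system tables).

```python
from dataclasses import dataclass


# Hypothetical model of the proposal; none of these names exist in HBase.
@dataclass(frozen=True)
class BootstrapTable:
    name: str                              # e.g. a table holding procedure state
    root_dir: str                          # separate dir in HDFS (path is made up)
    host: str = "master"                   # hosted on exactly one server
    region_count: int = 1                  # a single region ...
    splittable: bool = False               # ... that never splits
    uses_regular_assignment: bool = False  # bypasses normal region assignment
    dedicated_wal: bool = True             # WAL not shared with other regions


def startup_order(bootstrap_tables, system_tables):
    """Return the order in which tables become available.

    Bootstrap tables come first because they do not depend on meta.
    Meta follows, then the remaining system tables in the order given.
    """
    steps = []
    for table in bootstrap_tables:
        # WAL recovery and fencing happen before the region is opened.
        steps.append(f"recover-wal+fence:{table.name}")
        steps.append(f"open:{table.name}")
    steps.append("assign:meta")
    steps.extend(f"assign:{name}" for name in system_tables if name != "meta")
    return steps


if __name__ == "__main__":
    tables = [BootstrapTable("procedures", "/hbase/bootstrap/procedures")]
    for step in startup_order(tables, ["meta", "namespace", "acl", "quota"]):
        print(step)
```

The point of the sketch is only that data needed by assignment itself sits in front of meta in the ordering, so it never waits on meta being assigned.
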
> This has the benefit of using a region to keep the data, frees us from re-implementing caching, and lets us reuse the same battle-tested WAL / Memstore / recovery mechanisms.
>
> --
> This message was sent by Atlassian JIRA (v6.3.4#6332)