hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13260) Bootstrap Tables for fun and profit
Date Wed, 29 Apr 2015 19:00:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519954#comment-14519954 ]

Enis Soztutar commented on HBASE-13260:

This is a useful exercise (although, Nick, sorry to delay the RC). I did some more experiments
yesterday. A pure FSHLog based proc store can do 1M procedures in ~20 seconds which is more
or less on par with the current WAL based one. For figuring out where the bottleneck is, I've
changed the region based proc store to instead use {{numShards}} tables and do simple sharding.
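
As a rough illustration of what "simple sharding" could look like, here is a minimal sketch that
routes each procedure ID to one of {{numShards}} stores by modulo. The class and method names are
hypothetical, and the in-memory lists only stand in for the per-shard tables; the actual code in
the branch linked below may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: routes each procedure ID to one of numShards stores. */
final class ShardedProcStoreSketch {
    // Each inner list stands in for one single-region table (an assumption for illustration).
    private final List<List<Long>> shards = new ArrayList<>();

    ShardedProcStoreSketch(int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        for (int i = 0; i < numShards; i++) {
            shards.add(new ArrayList<>());
        }
    }

    /** Picks the shard by procId modulo numShards; floorMod keeps the result non-negative. */
    int shardFor(long procId) {
        return (int) Math.floorMod(procId, (long) shards.size());
    }

    /** Writers contend only on the shard they hash to, not on one global store. */
    void insert(long procId) {
        List<Long> shard = shards.get(shardFor(procId));
        synchronized (shard) {
            shard.add(procId);
        }
    }

    int shardSize(int shard) {
        List<Long> target = shards.get(shard);
        synchronized (target) {
            return target.size();
        }
    }

    public static void main(String[] args) {
        ShardedProcStoreSketch store = new ShardedProcStoreSketch(8);
        for (long procId = 0; procId < 1000; procId++) {
            store.insert(procId);
        }
        for (int i = 0; i < 8; i++) {
            System.out.println("shard " + i + ": " + store.shardSize(i));
        }
    }
}
```

With sequential procedure IDs, modulo routing spreads writes evenly, so each shard sees about
1/numShards of the load.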

With 8+ shards, writing 1M procedures takes roughly 25-31 secs (compared to ~20 secs).
4 shards:
java.lang.AssertionError: Wrote 1000000 procedures with 50 threads with useProcV2Wal=false
hsync=false in 2mins, 7.265sec (127.265sec)
8 shards:
java.lang.AssertionError: Wrote 1000000 procedures with 50 threads with useProcV2Wal=false
hsync=false in 31.0280sec (31.028sec)
16 shards:
java.lang.AssertionError: Wrote 1000000 procedures with 50 threads with useProcV2Wal=false
hsync=false in 25.4330sec (25.433sec)
32 shards: 
java.lang.AssertionError: Wrote 1000000 procedures with 50 threads with useProcV2Wal=false
hsync=false in 25.9470sec (25.947sec)
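
To make the timings easier to compare, the sketch below converts them to procedures per second.
The helper name is made up for illustration; the inputs are just the four measurements above.
It works out to about 7.9K procedures/sec with 4 shards and about 39K with 16.

```java
/** Converts the elapsed times reported above into procedures per second. */
final class ProcThroughput {
    static double procsPerSecond(long procedures, double seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return procedures / seconds;
    }

    public static void main(String[] args) {
        // Shard counts and elapsed seconds copied from the test output above.
        int[] shardCounts = {4, 8, 16, 32};
        double[] elapsedSeconds = {127.265, 31.028, 25.433, 25.947};
        for (int i = 0; i < shardCounts.length; i++) {
            System.out.printf("%2d shards: %.0f procedures/sec%n",
                shardCounts[i], procsPerSecond(1_000_000L, elapsedSeconds[i]));
        }
    }
}
```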

So it seems that the bottleneck is not on the CPU, but the HRegion's concurrency. I've also
done some basic testing with ASYNC_WAL and SKIP_WAL, which surprisingly were slower than SYNC_WAL.
I think it is an area worth digging into later. Maybe I am still missing something (code is
at https://github.com/enis/hbase/tree/hbase-13260-review).

I am not suggesting that we do sharding for the store only to get around the region concurrency
problem. Any improvement in this is definitely a big win for both regular data and proc metadata,
but I am not sure whether we can get there soon enough. 

I like the idea of different stores for different kinds of procedures. We already keep (some)
assignment state in meta (in zk-less AM) which is kind of like a custom proc on a meta proc
store. In an alternative world, we could have used the meta table for everything (table descriptors,
table state, assignments, list of region files) and be done with it.

To keep the amount of work down, though, I think we should choose one and stick with it. Otherwise,
we have to support two alternative code paths, migration code, upgrading etc. It is just wasted
effort I think. Whether to go with the wal based one or region based one is a question of
the design of proc-based assignment since for DDL ops it does not matter. Unfortunately it
is not formalized yet. If we end up splitting meta, we can even do proc store on meta. 

If we think that assignment using procs will use the local proc store in master, we should
go with the WAL based one since I don't think doing sharding for the region based one is right.
Otherwise, we should go with the region based one. Sorry this is vague, but since we have
yet to figure out the specifics of the new AM, it is hard to decide. 

> Bootstrap Tables for fun and profit 
> ------------------------------------
>                 Key: HBASE-13260
>                 URL: https://issues.apache.org/jira/browse/HBASE-13260
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.1.0
>         Attachments: hbase-13260_bench.patch, hbase-13260_prototype.patch
> Over at the ProcV2 discussions (HBASE-12439) and elsewhere I was mentioning an idea where
we may want to use regular old regions to store/persist some data needed for HBase master
to operate. 
> We regularly use system tables for storing system data. acl, meta, namespace, quota are
some examples. We also store the table state in meta now. Some data is persisted in zk only
(replication peers and replication state, etc). We are moving away from zk as a permanent
storage. As any self-respecting database does, we should store almost all of our data in HBase.
> However, we have an "availability" dependency between different kinds of data. For example
all system tables need meta to be assigned first. All master operations need ns table to be
assigned, etc. 
> For at least two types of data, (1) procedure v2 states, (2) RS groups in HBASE-6721
we cannot depend on meta being assigned since "assignment" itself will depend on accessing
this data. The solution in (1) is to implement a custom WAL format, and custom recover lease
and WAL recovery. The solution in (2) is to have the table to store this data, but also cache
it in zk for bootstrapping initial assignments. 
> For solving both of the above (and possible future use cases if any), I propose we add
a "bootstrap table" concept, which is: 
>  - A set of predefined tables hosted in a separate dir in HDFS. 
>  - A table is only 1 region, not splittable 
>  - Not assigned through regular assignment 
>  - Hosted only on 1 server (typically master)
>  - Has a dedicated WAL. 
>  - A service does WAL recovery + fencing for these tables. 
> This has the benefit of using a region to keep the data, but frees us from re-implementing
caching and we can use the same WAL / Memstore / Recovery mechanisms that are battle-tested.


This message was sent by Atlassian JIRA
