hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14090) Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS
Date Mon, 27 Feb 2017 09:00:55 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885393#comment-15885393 ]

Xiang Li commented on HBASE-14090:

Hi [~uagashe], thanks for the summary of the discussion!

Regarding "Discussion on new radically different approach to HBase FS directory layout REDO
work", item 5, b and c:
b. APIs/operations: CRUD, e.g. storage.createTable(StorageTableData tableData)...
c. In terms of create operation, above mentioned models and API should support:
    i. Creating table along with all its regions and regions with their stores and storefiles
to be constructed in-memory and single createTable call on storage will suffice to create
those artifacts.
   ii. On the other hand if required, empty table can be created and then empty regions can
be added to it later and then stores with the list of store files can be added to regions
in a table etc.
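
To make the two creation styles in item c concrete, here is a rough sketch. Only the call shape
storage.createTable(StorageTableData tableData) comes from the discussion; StorageRegionData,
StorageStoreData, InMemoryStorage, addRegion, addStore and storeFileCount are hypothetical names
invented for illustration, and the in-memory map is merely a stand-in for whatever the storage
layer would really persist to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model classes: only the call shape
// storage.createTable(StorageTableData tableData) comes from the discussion.
final class StorageStoreData {
    final String family;
    final List<String> storeFiles = new ArrayList<>();
    StorageStoreData(String family) { this.family = family; }
}

final class StorageRegionData {
    final String name;
    final Map<String, StorageStoreData> stores = new HashMap<>();
    StorageRegionData(String name) { this.name = name; }
}

final class StorageTableData {
    final String name;
    final Map<String, StorageRegionData> regions = new HashMap<>();
    StorageTableData(String name) { this.name = name; }
}

// In-memory stand-in for the storage layer (an assumption, not HBase code).
final class InMemoryStorage {
    private final Map<String, StorageTableData> tables = new HashMap<>();

    // Style (i): one call stores the table with everything already attached to it.
    void createTable(StorageTableData table) {
        if (tables.putIfAbsent(table.name, table) != null) {
            throw new IllegalStateException("table exists: " + table.name);
        }
    }

    // Style (ii): regions and stores are attached to an existing table step by step.
    void addRegion(String tableName, StorageRegionData region) {
        require(tableName).regions.put(region.name, region);
    }

    void addStore(String tableName, String regionName, StorageStoreData store) {
        StorageRegionData region = require(tableName).regions.get(regionName);
        if (region == null) {
            throw new IllegalArgumentException("no such region: " + regionName);
        }
        region.stores.put(store.family, store);
    }

    // Counts store files across all regions and stores of one table.
    int storeFileCount(String tableName) {
        int count = 0;
        for (StorageRegionData region : require(tableName).regions.values()) {
            for (StorageStoreData store : region.stores.values()) {
                count += store.storeFiles.size();
            }
        }
        return count;
    }

    private StorageTableData require(String tableName) {
        StorageTableData table = tables.get(tableName);
        if (table == null) {
            throw new IllegalArgumentException("no such table: " + tableName);
        }
        return table;
    }
}

final class CreateStylesDemo {
    public static void main(String[] args) {
        InMemoryStorage storage = new InMemoryStorage();

        // (i) Build table, region, store and store file in memory, then one createTable call.
        StorageStoreData cf = new StorageStoreData("cf");
        cf.storeFiles.add("hfile-1");
        StorageRegionData r1 = new StorageRegionData("r1");
        r1.stores.put(cf.family, cf);
        StorageTableData t1 = new StorageTableData("t1");
        t1.regions.put(r1.name, r1);
        storage.createTable(t1);

        // (ii) Create an empty table first, then add a region and a store with its files.
        storage.createTable(new StorageTableData("t2"));
        storage.addRegion("t2", new StorageRegionData("r1"));
        StorageStoreData cf2 = new StorageStoreData("cf");
        cf2.storeFiles.add("hfile-2");
        cf2.storeFiles.add("hfile-3");
        storage.addStore("t2", "r1", cf2);

        System.out.println(storage.storeFileCount("t1") + " " + storage.storeFileCount("t2"));
    }
}
```

In this sketch both styles end in the same stored model; they differ only in whether the caller
assembles the whole hierarchy before the first storage call or grows it afterwards.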

When talking about CRUD, you gave "create table" as an example of "C" (create). If so, what
about the HBase "Put"? Is it categorized as "U" (update)? I think Put should fall under "create",
while create table is not part of CRUD, because CRUD is DML whereas create table is DDL (as
can be seen in the help output of the hbase shell).

I get your idea, but strictly speaking, I am not sure that create table is part of CRUD.
Am I being too critical / dogmatic about that?

> Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS
> --------------------------------------------------------------------------
>                 Key: HBASE-14090
>                 URL: https://issues.apache.org/jira/browse/HBASE-14090
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Assignee: Sean Busbey
> Our layout as is won't work if 1M regions; e.g. HDFS will fall over if directories of
> hundreds of thousands of files. HBASE-13991 (Humongous Tables) would address this specific
> directory problem only by adding subdirs under table dir but there are other issues with our
> current layout:
>  * Our table/regions/column family 'facade' has to be maintained in two locations --
> in master memory and in the hdfs directory layout -- and the farce needs to be kept synced
> or worse, the model management is split between master memory and DFS layout. 'Syncing' in
> HDFS has us dropping constructs such as 'Reference' and 'HalfHFiles' on split, 'HFileLinks'
> when archiving, and so on. This 'tie' makes it hard to make changes.
>  * While HDFS has atomic rename, useful for fencing and for having files added atomically,
> if the model were solely owned by hbase, there are hbase primitives we could make use of --
> changes in a row are atomic and coprocessors -- to simplify table transactions and provide
> more consistent views of our model to clients; file 'moves' could be a memory operation only
> rather than an HDFS call; sharing files between tables/snapshots and when it is safe to remove
> them would be simplified if one owner only; and so on.
> This is an umbrella blue-sky issue to discuss what a new layout would look like and how
> we might get there. I'll follow up with some sketches of what new layout could look like that
> come of some chats a few of us have been having. We are also under the 'delusion' that move
> to a new layout could be done as part of a rolling upgrade and that the amount of work involved
> is not gargantuan.

This message was sent by Atlassian JIRA