hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14623) Implement dedicated WAL for system tables
Date Tue, 22 Mar 2016 16:12:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206670#comment-15206670 ]

Ted Yu commented on HBASE-14623:

bq. Why would log split and replay be 'faster'?

For log split, WAL edits for system tables are not mixed with edits from user tables, which
greatly reduces the amount of data to be split.
For WAL replay, the benefit comes from replaying edits for system tables ahead of edits for
user tables.
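
To make the log split argument concrete, here is a toy model in Java. It is not HBase code:
the Edit type, isSystemTable and the sample table names are hypothetical, and treating every
table in the hbase namespace other than hbase:meta as a system table is an assumption. It
only counts how many edits a splitter has to read to recover the system table edits, with a
shared WAL versus a dedicated one.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types exist in HBase. */
final class WalSplitSketch {

  /** A WAL edit reduced to the table it belongs to (hypothetical type). */
  static final class Edit {
    final String table;

    Edit(String table) {
      this.table = table;
    }
  }

  /** Assumption: non-meta tables in the "hbase" namespace are system tables. */
  static boolean isSystemTable(String table) {
    return table.startsWith("hbase:") && !table.equals("hbase:meta");
  }

  /** Shared WAL: the splitter has to read every edit to find the system ones. */
  static int editsReadFromSharedWal(List<Edit> wal) {
    return wal.size();
  }

  /** Dedicated WAL: only system table edits were ever written to it. */
  static int editsReadFromDedicatedWal(List<Edit> allEdits) {
    int count = 0;
    for (Edit e : allEdits) {
      if (isSystemTable(e.table)) {
        count++;
      }
    }
    return count;
  }

  /** Builds a sample workload with many user edits and few system edits. */
  static List<Edit> sampleEdits(int userEdits, int systemEdits) {
    List<Edit> edits = new ArrayList<>();
    for (int i = 0; i < userEdits; i++) {
      edits.add(new Edit("default:usertable"));
    }
    for (int i = 0; i < systemEdits; i++) {
      edits.add(new Edit("hbase:namespace"));
    }
    return edits;
  }
}
```

With sampleEdits(1000, 3), the shared WAL case reads 1003 edits while the dedicated WAL case
reads 3. The real saving depends on the ratio of user to system edits on the region server.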

bq. Why would we recover the system tables 'faster'?

With WAL replay for system tables finishing before edits for user tables are replayed, system
table assignment can take place at an earlier stage of cluster recovery.

bq. It would probably benefit AssignmentManager on system table region assignment.

Stephen has more details backing the above statement.

bq. Should we compound all system tables so assignment is easier rather than have a small
system table per domain

Compounding system tables requires understanding the needs of each of the system tables. I am
not sure this is within the scope of this JIRA.

bq. In particular, how will this not slow down assign (if each system table has to wait on
its own log to finish split

Currently, with distributed log splitting, not only does log splitting for system tables (such
as hbase:namespace) have to finish, but log splitting for user tables also has to complete
before tables are assigned.
In this regard, there is no slowdown in assignment.

bq. Any testing done?

I need to load up a tarball built with the patch and see the effect on a cluster.

bq. meta WAL handling goes untouched?

That's right.

bq. We in effect copy/paste the hbase:meta handling?

I need to go over the patch in detail (it has been almost 4 months). I started by referencing
the hbase:meta handling. Later on, I addressed WAL provider integration, where a system table
is allowed to have more than one region.

bq. There is a logroller for user-space WALs, one for meta and then another for system tables?

This is an interesting observation. Considering the potential for more system tables to be
added (e.g. hbase:backup), I think it makes sense to have another log roller for the system
tables, since the edits for system tables can be quite large.

bq. We have enough threads running in the system already.

Yes, and more are being added along with new features. If you feel strongly about this, I can
merge the system WAL roller into the one for hbase:meta.

bq. Does that mean .meta is not a sys table?

No, that is not the case. Keeping the .meta WAL is mostly for backward compatibility.

bq. No tests?

Let me try to add some test(s) in the next patch.

> Implement dedicated WAL for system tables
> -----------------------------------------
>                 Key: HBASE-14623
>                 URL: https://issues.apache.org/jira/browse/HBASE-14623
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>              Labels: wal
>             Fix For: 2.0.0
>         Attachments: 14623-v1.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt,
14623-v3.txt, 14623-v4.txt
> As Stephen suggested in the parent JIRA, dedicating a separate WAL to system tables (other
than hbase:meta) should be done in a new JIRA.
> This task is to fulfill the system WAL separation.
> Below is summary of discussion:
> If system tables have their own WAL, we can recover them faster (fast log split, fast log
replay). It would probably also benefit AssignmentManager during system table region
assignment. At this time, the new AssignmentManager is not planned to change the WAL, so this
JIRA is good for the overall system and is not specific to AssignmentManager.
> There are 3 strategies for implementing system table WAL:
> 1. one WAL for all non-meta system tables
> 2. one WAL for each non-meta system table
> 3. one WAL for each region of non-meta system table
> Currently most system tables are single-region tables (only the ACL table may become big),
so choices 2 and 3 are basically the same.
> From an implementation point of view, choices 2 and 3 are cleaner than choice 1 (as we
already have one WAL for the META table and can reuse the logic). With choice 2 or 3,
assignment manager performance should not be impacted, and it would be easier for the
assignment manager to assign system table regions (e.g. without waiting for user table log
splitting to complete before assigning system table regions).
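
The three strategies in the issue description can be sketched as a mapping from a region to
the name of the WAL that receives its edits. This is a toy model under stated assumptions,
not the patch: the Strategy enum, the Region type, walGroup and the "system" name prefix are
all hypothetical and are not HBase APIs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: the enum and method names are hypothetical, not HBase APIs. */
final class SystemWalStrategySketch {

  /** Choices 1, 2 and 3 from the issue description. */
  enum Strategy { ONE_FOR_ALL, PER_TABLE, PER_REGION }

  /** A region identified by its table and an encoded region name. */
  static final class Region {
    final String table;
    final String encodedName;

    Region(String table, String encodedName) {
      this.table = table;
      this.encodedName = encodedName;
    }
  }

  /** Returns the (made-up) WAL group name a region's edits would go to. */
  static String walGroup(Strategy strategy, Region region) {
    switch (strategy) {
      case ONE_FOR_ALL:
        return "system";
      case PER_TABLE:
        return "system." + region.table;
      default: // PER_REGION
        return "system." + region.table + "." + region.encodedName;
    }
  }

  /** Counts the distinct WALs the given regions would use under a strategy. */
  static int walCount(Strategy strategy, List<Region> regions) {
    Set<String> groups = new HashSet<>();
    for (Region r : regions) {
      groups.add(walGroup(strategy, r));
    }
    return groups.size();
  }
}
```

For single-region system tables, PER_TABLE and PER_REGION produce the same number of WALs,
which matches the remark that choices 2 and 3 are basically the same. They diverge only once
a system table (the ACL table, for instance) has more than one region.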
