hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14623) Implement dedicated WAL for system tables
Date Thu, 19 May 2016 14:31:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291174#comment-15291174 ]

Yu Li commented on HBASE-14623:
-------------------------------

We recently hit an issue where recovery of the namespace table was blocked by WAL splitting
for the RS that previously held it. The sequence was:
1. During a *rolling upgrade*, many RSes (not just the single RS holding the namespace table)
crashed because of a temporary network problem that made every datanode in the pipeline bad.
2. The master restarted before distributed log splitting (DLS) for the RS that previously held
the namespace region had finished. It got stuck and finally aborted when the namespace region
failed to come online within the timeout ({{hbase.master.namespace.init.timeout}}, which
defaults to 5 min), see {{TableNamespaceManager#start}}.

I think that if we added a similar mechanism to split and recover the namespace table early,
as is already done for the meta table, we could avoid this problem:
{code:title=SplitLogWorker#taskLoop|borderStyle=solid}
      // pick meta wal firstly
      int offset = (int) (Math.random() * paths.size());
      for (int i = 0; i < paths.size(); i++) {
        if (DefaultWALProvider.isMetaFile(paths.get(i))) {
          offset = i;
          break;
        }
      }
{code}
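
To illustrate the idea, here is a minimal, standalone sketch of how the same selection could be extended so that a WAL belonging to a non-meta system table is picked right after meta. This is not the actual {{SplitLogWorker}} code and not a proposed patch: the class name, the {{pickOffset}} helper, the {{.sys}} suffix and the predicate used to recognize a system-table WAL are all hypothetical, and the {{.meta}} suffix check is only an assumed stand-in for {{DefaultWALProvider.isMetaFile}}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch, independent of HBase classes.
class WalPickSketch {

  /**
   * Returns the index to start from: the first meta WAL if there is one,
   * otherwise the first system-table WAL, otherwise a random index
   * (0 if the list is empty).
   */
  static int pickOffset(List<String> paths, Predicate<String> isMetaWal,
      Predicate<String> isSystemWal, Random random) {
    int systemIdx = -1;
    for (int i = 0; i < paths.size(); i++) {
      String path = paths.get(i);
      if (isMetaWal.test(path)) {
        return i; // meta always wins
      }
      if (systemIdx < 0 && isSystemWal.test(path)) {
        systemIdx = i; // remember the first system-table WAL as fallback
      }
    }
    if (systemIdx >= 0) {
      return systemIdx;
    }
    return paths.isEmpty() ? 0 : random.nextInt(paths.size());
  }

  public static void main(String[] args) {
    // File names and suffixes below are made up for the example.
    List<String> paths = Arrays.asList("rs1.1463668000000",
        "rs1.1463668000001.sys", "rs1.1463668000002.meta");
    int offset = pickOffset(paths, p -> p.endsWith(".meta"),
        p -> p.endsWith(".sys"), new Random());
    System.out.println(offset); // prints 2, the meta WAL
  }
}
```

With a dedicated WAL for system tables, as this JIRA proposes, telling such files apart from user-table WALs would become possible, which is what the second predicate stands in for.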

So maybe this is a good reason for this JIRA to go in? Thanks.

> Implement dedicated WAL for system tables
> -----------------------------------------
>
>                 Key: HBASE-14623
>                 URL: https://issues.apache.org/jira/browse/HBASE-14623
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>              Labels: wal
>             Fix For: 2.0.0
>
>         Attachments: 14623-v1.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt, 14623-v3.txt, 14623-v4.txt
>
>
> As Stephen suggested in parent JIRA, dedicating separate WAL for system tables (other than hbase:meta) should be done in new JIRA.
> This task is to fulfill the system WAL separation.
> Below is summary of discussion:
> For system table to have its own WAL, we would recover system table faster (fast log split, fast log replay). It would probably benefit AssignmentManager on system table region assignment. At this time, the new AssignmentManager is not planned to change WAL. So the existence of this JIRA is good for overall system, not specific to AssignmentManager.
> There are 3 strategies for implementing system table WAL:
> 1. one WAL for all non-meta system tables
> 2. one WAL for each non-meta system table
> 3. one WAL for each region of non-meta system table
> Currently most system tables are one-region tables (only the ACL table may become big), so choices 2 and 3 are basically the same.
> From an implementation point of view, choices 2 and 3 are cleaner than choice 1 (we already have 1 WAL for the META table and can reuse that logic). With choice 2 or 3, assignment manager performance should not be impacted, and it would be easier for the assignment manager to assign a system table region (e.g. without waiting for user table log split to complete before assigning the system table region).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
