hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue
Date Fri, 03 Nov 2017 20:09:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238284#comment-16238284 ]

stack commented on HBASE-12125:

[~churromorales] HBCK presumes how assignment works. It also messes w/ hbase privates. In
hbase2, assignment has been redone such that the Master's in-memory view is definitive --
no more state distributed over fs, zk, and master -- and Master effects any or all change.
Also Master internals have changed. HBCK at a minimum no longer works and, at worst, can actually
do damage. TODO is an HBCK2. Shout if you need more detail sir.

> Add Hbck option to check and fix WAL's from replication queue
> -------------------------------------------------------------
>                 Key: HBASE-12125
>                 URL: https://issues.apache.org/jira/browse/HBASE-12125
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 3.0.0
>            Reporter: Virag Kothari
>            Assignee: Vincent Poon
>            Priority: Major
>         Attachments: HBASE-12125.v1.master.patch, HBASE-12125.v2.master.patch, HBASE-12125.v3.master.patch
> The replication source will discard the WAL file in many cases when it encounters an
> exception reading it. This can cause data loss, and the underlying reason for the failed
> read remains hidden. Only in certain scenarios should the replication source dump the
> current WAL and move to the next one.
> This JIRA aims to have an hbck option to check the WAL files of replication queues for
> any inconsistencies and also provide an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from the memory of
> the replication source manager and replication sources.
> A region server endpoint call from the hbck client to the region server can be used to
> achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL currently read
> by the replication source) from the replication queue. If there is a position associated,
> it also seeks to that position and reads an entry from there.
> -hardCheckReplicationWAL: Checks all WAL paths from replication queues by reading them
> completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes the WALs from replication queues which are not present
> on hdfs.
> -fixCorruptedReplicationWAL: Removes the WALs from replication queues which are corrupted
> (based on the findings from softCheck/hardCheck). Also the WAL's are moved to a quarantine
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is first rolled
> over and then handled in the same way as with the -fixCorruptedReplicationWAL option.
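
The check and fix options quoted above can be sketched roughly as follows. This is only a toy model, not HBase code and not the attached patch: the replication queue is a plain Python list of paths, the local filesystem stands in for hdfs, and the "WAL" is a hypothetical file of 4-byte length-prefixed entries rather than the real WAL format. All function names, and the quarantine directory, are assumptions made for illustration. The roll-and-fix variant is not modelled.

```python
import os
import shutil
import struct


class CorruptWAL(Exception):
    """Raised when a toy WAL file cannot be parsed."""


def read_entry(f):
    """Read one length-prefixed entry; return None at a clean end of file."""
    header = f.read(4)
    if not header:
        return None
    if len(header) < 4:
        raise CorruptWAL("truncated length header")
    (size,) = struct.unpack(">I", header)
    payload = f.read(size)
    if len(payload) < size:
        raise CorruptWAL("truncated entry payload")
    return payload


def soft_check(queue, position=0):
    """Open only the oldest WAL, seek to the saved position, read one entry."""
    if not queue:
        return True
    try:
        with open(queue[0], "rb") as f:
            f.seek(position)
            read_entry(f)
        return True
    except (OSError, CorruptWAL):
        return False


def hard_check(queue):
    """Read every existing WAL completely; return the paths that are corrupt."""
    corrupted = []
    for path in queue:
        if not os.path.exists(path):
            continue  # missing files are left to fix_missing
        try:
            with open(path, "rb") as f:
                while read_entry(f) is not None:
                    pass
        except CorruptWAL:
            corrupted.append(path)
    return corrupted


def fix_missing(queue):
    """Return the queue without entries whose file no longer exists."""
    return [p for p in queue if os.path.exists(p)]


def fix_corrupted(queue, corrupted, quarantine_dir):
    """Move corrupted WALs to a quarantine directory (assumed) and drop them."""
    os.makedirs(quarantine_dir, exist_ok=True)
    for path in corrupted:
        shutil.move(path, os.path.join(quarantine_dir, os.path.basename(path)))
    return [p for p in queue if p not in corrupted]
```

In this sketch the soft check is cheap because it touches a single file and a single entry, while the hard check costs a full read of every queued file; the fix functions return a new queue rather than mutating shared state, which sidesteps the zk and in-memory coordination the description mentions.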

This message was sent by Atlassian JIRA
