hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
Date Tue, 02 Oct 2018 05:13:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635043#comment-16635043 ]

stack commented on HBASE-21191:

Just to note that I had a situation where meta was not assigned and we went into the
"holding pattern" added here. I was able to assign meta and namespace using hbck2,
which moved us out of the "holding pattern" and let startup continue.

> Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals
> have been cleared).
> ----------------------------------------------------------------------------------------------------------
>                 Key: HBASE-21191
>                 URL: https://issues.apache.org/jira/browse/HBASE-21191
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.1.1
>         Attachments: HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch,
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, HBASE-21191.branch-2.1.005.patch,
> HBASE-21191.branch-2.1.006.patch, HBASE-21191.branch-2.1.007.patch
> If the masterprocwals have been removed -- operator error, HDFS data loss, or because
> we have gotten ourselves into a pathological state where we have hundreds of masterprocwals
> to process and it is taking too long so we just want to start over -- then master startup
> will have a dilemma. Master startup needs hbase:meta to be online. If the masterprocwals have
> been removed, there may be no outstanding assign or servercrashprocedure with coverage for
> hbase:meta (I ran into this issue repeatedly in internal testing purging masterprocwals on
> a large test cluster). Worse, when master startup cannot find an online hbase:meta, it exits
> after exhausting the RPC retries.
> So, we need a holding-pattern for master startup if hbase:meta is not online, if only
> so an operator can schedule an assign for meta or so they can assign fixup procedures (HBASE-20786
> has discussion on why we cannot just auto-schedule an assign of meta).
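
As a rough illustration of the holding pattern described above, startup could poll for the
region instead of exiting once retries run out. The Java sketch below is not the code in the
attached patches: the class HoldingPattern, the method waitUntil, and its parameters are
hypothetical names chosen for this example, and the "is hbase:meta online" check is left to
the caller as a supplier.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch of a startup "holding pattern". This is NOT HBase's
 * actual implementation; all names here are assumptions for illustration.
 */
final class HoldingPattern {
  private HoldingPattern() {}

  /**
   * Polls isOnline until it returns true, so that an operator has time to
   * schedule an assign out-of-band. Gives up only when stopRequested
   * returns true or the waiting thread is interrupted.
   *
   * @return true if isOnline became true, false if we stopped waiting first
   */
  static boolean waitUntil(BooleanSupplier isOnline,
                           BooleanSupplier stopRequested,
                           long pollMillis) {
    long polls = 0;
    while (!isOnline.getAsBoolean()) {
      if (stopRequested.getAsBoolean()) {
        return false; // shutdown requested while holding
      }
      if (polls++ % 10 == 0) {
        // Log occasionally so the operator can see why startup is stalled.
        System.err.println("Region not online; holding until it is assigned");
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return false;
      }
    }
    return true;
  }
}
```

The point of the design is that the wait has no retry budget: it ends only when the region
comes online or the process is asked to stop, which leaves room for a manual assign.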

This message was sent by Atlassian JIRA
