hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18167) OfflineMetaRepair tool may cause HMaster abort always
Date Tue, 20 Jun 2017 14:16:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055822#comment-16055822 ]

Hadoop QA commented on HBASE-18167:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 57s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s {color} | {color:green} branch-1 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s {color} | {color:green} branch-1 passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 46s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 56s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 14s {color} | {color:red} hbase-client in branch-1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 12s {color} | {color:red} hbase-client in branch-1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 17s {color} | {color:red} hbase-server in branch-1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 18s {color} | {color:red} hbase-server in branch-1 failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} branch-1 passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s {color} | {color:green} branch-1 passed with JDK v1.7.0_131 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.8.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_131. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} hbase-server-jdk1.8.0_131 with JDK v1.8.0_131 generated 0 new + 5 unchanged - 5 fixed = 5 total (was 10) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} the patch passed with JDK v1.7.0_131 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_131. {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} hbase-server-jdk1.7.0_131 with JDK v1.7.0_131 generated 0 new + 5 unchanged - 5 fixed = 5 total (was 10) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 15m 23s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s {color} | {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 15s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} hbase-client-jdk1.8.0_131 with JDK v1.8.0_131 generated 0 new + 13 unchanged - 13 fixed = 13 total (was 26) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} hbase-server-jdk1.8.0_131 with JDK v1.8.0_131 generated 0 new + 3 unchanged - 3 fixed = 3 total (was 6) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} hbase-client-jdk1.7.0_131 with JDK v1.7.0_131 generated 0 new + 13 unchanged - 13 fixed = 13 total (was 26) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} hbase-server-jdk1.7.0_131 with JDK v1.7.0_131 generated 0 new + 3 unchanged - 3 fixed = 3 total (was 6) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 55s {color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 24s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 39s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.ipc.TestSimpleRpcScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:395d9a0 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12873679/HBASE-18167-branch-1-V2.patch |
| JIRA Issue | HBASE-18167 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c7f36e7fbc56 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1 / dead08d |
| Default Java | 1.7.0_131 |
| Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_131 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_131 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/branch-findbugs-hbase-client.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/branch-findbugs-hbase-client.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/branch-findbugs-hbase-server.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/branch-findbugs-hbase-server.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/patch-findbugs-hbase-client.txt |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/patch-findbugs-hbase-server.txt |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/testReport/ |
| modules | C: hbase-client hbase-server U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7249/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> OfflineMetaRepair tool may cause HMaster abort always
> -----------------------------------------------------
>
>                 Key: HBASE-18167
>                 URL: https://issues.apache.org/jira/browse/HBASE-18167
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.4.0, 1.3.1, 1.3.2
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>            Priority: Critical
>             Fix For: 1.4.0
>
>         Attachments: HBASE-18167-branch-1.patch, HBASE-18167-branch-1-V2.patch
>
>
> In a production environment, we hit a scenario where some meta table HFile blocks were missing for an unknown reason.
> To recover the environment, we tried to rebuild meta using the OfflineMetaRepair tool and restarted the cluster, but HMaster could not finish its initialization. It always timed out because the namespace table region was never assigned.
> Steps to reproduce
> ==================
> 1. Assign the meta table region to HMaster (it can be on any RS; this is only to reproduce the scenario)
> {noformat}
> 	<property>
>             <name>hbase.balancer.tablesOnMaster</name>
>             <value>hbase:meta</value>
>         </property>
> {noformat}
> 2. Start HMaster and RegionServer
> 3. Create two namespaces, say "ns1" and "ns2"
> 4. Create two tables, 'ns1:t1' and 'ns2:t1'
> 5. flush 'hbase:meta'
> 6. Stop HMaster (graceful shutdown)
> 7. Kill -9 RegionServer (abnormal shutdown)
> 8. Run OfflineMetaRepair as follows:
> {noformat}
> 	hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair -fix
> {noformat}
> 9. Restart HMaster and RegionServer
> 10. HMaster is never able to finish its initialization and always aborts with the message below:
> {code}
> 2017-06-06 15:11:07,582 FATAL [Hostname:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: Timedout 120000ms waiting for namespace table to be assigned
>         at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
>         at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1054)
>         at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
>         at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:199)
>         at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1871)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> Root cause
> ==========
> 1. During HMaster startup, the AssignmentManager (AM) assumes a failover scenario because old WAL files exist, so SSH/SCP (ServerShutdownHandler/ServerCrashProcedure) splits the WAL files and assigns the regions the dead server was holding.
> 2. SSH/SCP retrieves the regions held by that server from meta and the AM's in-memory state, but meta has only the "regioninfo" entry (it was already rebuilt by OfflineMetaRepair). So an empty region list is returned and no assignment is triggered.
> 3. HMaster, which is waiting for the namespace table to be assigned, times out and always aborts.
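
The root cause above can be illustrated with a small toy model. This is only a sketch and not HBase code: the class `MetaRebuildSketch`, the type `MetaRow`, the methods `regionsOnServer` and `namespaceReassigned`, and the server name "rs1" are all hypothetical. It assumes that crash handling reassigns exactly the regions that the by-server lookup returns, and that a rebuilt meta row carries no server location.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the root cause. Hypothetical names; not the HBase implementation. */
class MetaRebuildSketch {

    /** One meta row, reduced to the region name and its recorded location. */
    static final class MetaRow {
        final String regionName;
        final String server; // null when the row has only the "regioninfo" entry

        MetaRow(String regionName, String server) {
            this.regionName = regionName;
            this.server = server;
        }
    }

    /** Regions that meta records as hosted on the given dead server. */
    static List<String> regionsOnServer(List<MetaRow> meta, String deadServer) {
        List<String> regions = new ArrayList<>();
        for (MetaRow row : meta) {
            if (deadServer.equals(row.server)) {
                regions.add(row.regionName);
            }
        }
        return regions;
    }

    /** Assumption: crash handling reassigns only what the lookup returned. */
    static boolean namespaceReassigned(List<MetaRow> meta, String deadServer) {
        return regionsOnServer(meta, deadServer).contains("hbase:namespace");
    }

    public static void main(String[] args) {
        String deadServer = "rs1"; // hypothetical server name

        // Before the rebuild: meta still records where each region was hosted.
        List<MetaRow> original = Arrays.asList(
            new MetaRow("hbase:namespace", deadServer),
            new MetaRow("ns1:t1", deadServer));

        // After the rebuild: only the "regioninfo" entry exists, no location.
        List<MetaRow> rebuilt = Arrays.asList(
            new MetaRow("hbase:namespace", null),
            new MetaRow("ns1:t1", null));

        System.out.println("original meta, namespace reassigned: "
            + namespaceReassigned(original, deadServer));
        System.out.println("rebuilt meta, namespace reassigned: "
            + namespaceReassigned(rebuilt, deadServer));
    }
}
```

In this model the rebuilt meta yields an empty lookup result, so nothing is queued for assignment. That corresponds to the state in which the master keeps waiting for the namespace table until the 120000 ms timeout shown in the log.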



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
