hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17265) Region left unassigned in master failover when failed open
Date Tue, 07 Feb 2017 13:32:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855982#comment-15855982
] 

Hadoop QA commented on HBASE-17265:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 32m 29s {color} | {color:blue}
Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green}
The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red}
The patch doesn't appear to include any new or modified tests. Please justify why no new tests
are needed for this patch. Also please list what manual steps were performed to verify this
patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 22s {color}
| {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s {color} |
{color:green} branch-1 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} |
{color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s {color}
| {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color}
| {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 23s {color} | {color:red}
hbase-server in branch-1 has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s {color} |
{color:green} branch-1 passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s {color} |
{color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} |
{color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s {color} |
{color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s {color} | {color:green}
the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color}
| {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 24m 1s {color}
| {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2
2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 24s {color}
| {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s {color} |
{color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} |
{color:green} the patch passed with JDK v1.8.0_121 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s {color} |
{color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 149m 35s {color} | {color:red}
hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s {color}
| {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 227m 14s {color} | {color:black}
{color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestMasterBalanceThrottling |
|   | hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:e01ee2f |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12851332/HBASE-17265-branch-1.v2.patch
|
| JIRA Issue | HBASE-17265 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle
 compile  |
| uname | Linux d96f5ab7e375 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1 / 0a0aef3 |
| Default Java | 1.7.0_80 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_121 /usr/lib/jvm/java-7-oracle:1.7.0_80
|
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/5615/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
|
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/5615/artifact/patchprocess/patch-unit-hbase-server.txt
|
| unit test logs |  https://builds.apache.org/job/PreCommit-HBASE-Build/5615/artifact/patchprocess/patch-unit-hbase-server.txt
|
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5615/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5615/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Region left unassigned in master failover when failed open
> ----------------------------------------------------------
>
>                 Key: HBASE-17265
>                 URL: https://issues.apache.org/jira/browse/HBASE-17265
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 1.1.7
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-17265-branch-1.patch, HBASE-17265-branch-1.v2.patch
>
>
> This problem is very similar to HBASE-13330. Here too, ServerShutdownHandler (SSH) and AssignmentManager each 'thought' the other would assign the region, so the region was left unassigned.
> But HBASE-13330 only dealt with RS_ZK_REGION_FAILED_OPEN in {{processRegionInTransition}}.
> A region can also fail to open after {{processRegionInTransition}}. In my case, on master failover, the master assigned all RIT regions, but some failed to open (due to HBASE-17264). AssignmentManager received the zk event but skipped assigning the region, because it had been open on a failed server and was already in RIT before the master failover. The SSH also skipped assigning it because it was in RIT on another RS.
> The master received a zk event of RS_ZK_REGION_FAILED_OPEN and began to handle it:
> {noformat}
> 2016-11-23 17:11:46,944 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Handling
RS_ZK_REGION_FAILED_OPEN, server=example.org,30003,1479780976834, region=57513956a7b671f4e8da1598c2e2970e,
current_state={57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834}
> 2016-11-23 17:11:46,944 INFO  [AM.ZK.Worker-pool2-t1] master.RegionStates: Transition
{57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834}
to {57513956a7b671f4e8da1598c2e2970e state=CLOSED, ts=1479892306944, server=example.org,30003,1479780976834}
> 2016-11-23 17:11:46,945 WARN  [AM.ZK.Worker-pool2-t1] master.RegionStates: 57513956a7b671f4e8da1598c2e2970e
moved to CLOSED on example.org,30003,1479780976834, expected example.org,30003,1475893095003
> 2016-11-23 17:11:46,950 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Found
an existing plan for test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e. destination
server is example.org,30003,1479780976834 accepted as a dest server = false
> 2016-11-23 17:11:47,012 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: No previous
transition plan found (or ignoring an existing plan) for test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e.;
generated random plan=hri=test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e., src=,
dest=11.239.21.235,30003,1479781410131; 2 (online=3) available servers, forceNewPlan=true
> 2016-11-23 17:11:47,014 DEBUG [AM.ZK.Worker-pool2-t1] handler.ClosedRegionHandler: Handling
CLOSED event for 57513956a7b671f4e8da1598c2e2970e
> 2016-11-23 17:11:47,015 WARN  [AM.ZK.Worker-pool2-t1] master.RegionStates: 57513956a7b671f4e8da1598c2e2970e
moved to CLOSED on example.org,30003,1479780976834, expected example.org,30003,1475893095003
> {noformat}
> AssignmentManager skipped assigning it because the region had been on a failed server:
> {noformat}
> 2016-11-23 17:11:47,017 INFO  [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Skip
assigning test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e., it's host example.org,30003,1475893095003
is dead but not processed yet
> {noformat}
> SSH also skipped it because it was in RIT on another server:
> {noformat}
> 2016-11-23 17:12:17,850 INFO  [MASTER_SERVER_OPERATIONS-example.org:30001-0] master.RegionStates:
Transitioning {57513956a7b671f4e8da1598c2e2970e state=CLOSED, ts=1479892307015, server=example.org,30003,1479780976834}
will be handled by SSH for example.org,30003,1475893095003
> 2016-11-23 17:12:17,910 INFO  [MASTER_SERVER_OPERATIONS-example.org:30001-0] handler.ServerShutdownHandler:
Skip assigning region in transition on other server{57513956a7b671f4e8da1598c2e2970e state=CLOSED,
ts=1479892307015, server=example.org,30003,1479780976834}
> {noformat}
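
The two skip decisions in the logs above can be modeled in a few lines. This is only a sketch of the reasoning in the description, not HBase code: the class and method names (`UnassignedRegionSketch`, `amAssigns`, `sshAssigns`, `leftUnassigned`) are hypothetical, and the rule that SSH assigns a region with no RIT entry is an assumption made for illustration. The server names are taken from the log lines.

```java
import java.util.Collections;
import java.util.Set;

// Illustrative model only: every name here is hypothetical, NOT HBase's real API.
public class UnassignedRegionSketch {

    // AssignmentManager side: it defers to SSH when the region's last host is
    // dead and that server's shutdown has not been processed yet.
    static boolean amAssigns(String lastHost, Set<String> deadNotProcessed) {
        return !deadNotProcessed.contains(lastHost);
    }

    // SSH side: it skips a region that is in transition on a server other than
    // the dead one. Treating "no RIT entry" (null) as assignable is an assumption.
    static boolean sshAssigns(String ritServer, String deadServer) {
        return ritServer == null || ritServer.equals(deadServer);
    }

    // The region stays unassigned when both sides decide to skip it.
    static boolean leftUnassigned(String lastHost, String ritServer,
                                  Set<String> deadNotProcessed) {
        return !amAssigns(lastHost, deadNotProcessed)
            && !sshAssigns(ritServer, lastHost);
    }

    public static void main(String[] args) {
        // Values copied from the log excerpts in the issue description.
        String deadHost = "example.org,30003,1475893095003";
        String ritServer = "example.org,30003,1479780976834";
        Set<String> dead = Collections.singleton(deadHost);

        System.out.println("AM assigns:      " + amAssigns(deadHost, dead));
        System.out.println("SSH assigns:     " + sshAssigns(ritServer, deadHost));
        System.out.println("left unassigned: " + leftUnassigned(deadHost, ritServer, dead));
    }
}
```

In this model, each check is reasonable on its own; the gap appears only when the RIT entry names a different server than the dead last host, which matches the FAILED_OPEN case described in the issue.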



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
