hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17264) Process RIT with offline state will always fail to open in the first time
Date Tue, 06 Dec 2016 16:59:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726052#comment-15726052 ]

Hadoop QA commented on HBASE-17264:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 35s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} branch-1.1 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} branch-1.1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 49s {color} | {color:red} hbase-server in branch-1.1 has 81 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s {color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_111. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 2s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 26s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_111. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 53s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 121m 21s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.handler.TestEnableTableHandler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8012383 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12841968/HBASE-17264-branch-1.1.patch |
| JIRA Issue | HBASE-17264 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 5c21ce425b03 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1.1 / baf9409 |
| Default Java | 1.7.0_80 |
| Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_111 /usr/lib/jvm/java-7-oracle:1.7.0_80 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/artifact/patchprocess/branch-javadoc-hbase-server-jdk1.8.0_111.txt |
| javadoc | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/artifact/patchprocess/patch-javadoc-hbase-server-jdk1.8.0_111.txt |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/4811/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Process RIT with offline state will always fail to open in the first time
> -------------------------------------------------------------------------
>
>                 Key: HBASE-17264
>                 URL: https://issues.apache.org/jira/browse/HBASE-17264
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 1.1.7
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-17264-branch-1.1.patch
>
>
> In AssignmentManager#processRegionsInTransition, when handling a region in the M_ZK_REGION_OFFLINE state, we use a handler to reassign the region. But when calling assign, we pass setOfflineInZK=false, so the ZK node is not set to offline again:
> {code}
> case M_ZK_REGION_OFFLINE:
>         // Insert in RIT and resend to the regionserver
>         regionStates.updateRegionState(rt, State.PENDING_OPEN);
>         final RegionState rsOffline = regionStates.getRegionState(regionInfo);
>         this.executorService.submit(
>           new EventHandler(server, EventType.M_MASTER_RECOVERY) {
>             @Override
>             public void process() throws IOException {
>               ReentrantLock lock = locker.acquireLock(regionInfo.getEncodedName());
>               try {
>                 RegionPlan plan = new RegionPlan(regionInfo, null, sn);
>                 addPlan(encodedName, plan);
>                 assign(rsOffline, false, false);  // we decide not to setOfflineInZK
>               } finally {
>                 lock.unlock();
>               }
>             }
>           });
>         break;
> {code}
> But when setOfflineInZK is false, we pass a ZK node version of -1 to the regionserver, meaning the ZK node does not exist. In fact, the offline ZK node does exist, with a different version. The RegionServer reports a failure to open because of this.
> This actually happened in our test environment. The master receives the FAILED_OPEN ZK event and retries later, but due to another bug (HBASE-17265), the region remains in the closed state forever.
> The master assigns the region in RIT:
> {noformat}
> 2016-11-23 17:11:46,842 INFO  [example.org:30001.activeMasterManager] master.AssignmentManager:
Processing 57513956a7b671f4e8da1598c2e2970e in state: M_ZK_REGION_OFFLINE
> 2016-11-23 17:11:46,842 INFO  [example.org:30001.activeMasterManager] master.RegionStates:
Transition {57513956a7b671f4e8da1598c2e2970e state=OFFLINE, ts=1479892306738, server=example.org,30003,1475893095003}
to {57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306842, server=example.org,30003,1479780976834}
> 2016-11-23 17:11:46,842 INFO  [example.org:30001.activeMasterManager] master.AssignmentManager:
Processed region 57513956a7b671f4e8da1598c2e2970e in state M_ZK_REGION_OFFLINE, on server:
example.org,30003,1479780976834
> 2016-11-23 17:11:46,843 INFO  [MASTER_SERVER_OPERATIONS-example.org:30001-0] master.AssignmentManager:
Assigning test,QFO7M,1475986053104.57513956a7b671f4e8da1598c2e2970e. to example.org,30003,1479780976834
> {noformat}
> The RegionServer receives the open region request and creates an OpenRegionHandler to open the region, only to find that the RIT node's version is not what it expected. The RS transitions the RIT ZK node to failed open in the end:
> {noformat}
> 2016-11-23 17:11:46,860 WARN  [RS_OPEN_REGION-example.org:30003-1] coordination.ZkOpenRegionCoordination:
Failed transition from OFFLINE to OPENING for region=57513956a7b671f4e8da1598c2e2970e
> 2016-11-23 17:11:46,861 WARN  [RS_OPEN_REGION-example.org:30003-1] handler.OpenRegionHandler:
Region was hijacked? Opening cancelled for encodedName=57513956a7b671f4e8da1598c2e2970e
> 2016-11-23 17:11:46,860 WARN  [RS_OPEN_REGION-example.org:30003-1] zookeeper.ZKAssign:
regionserver:30003-0x15810b5f633015f, quorum=hbase4dev04.et2sqa:2181,hbase4dev05.et2sqa:2181,hbase4dev06.et2sqa:2181,
baseZNode=/test-hbase11-func2 Attempt to transition the unassigned node for 57513956a7b671f4e8da1598c2e2970e
from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING failed, the node existed but was version
3 not the expected version -1
> {noformat}
> The master receives this ZK event and begins to handle RS_ZK_REGION_FAILED_OPEN:
> {noformat}
> 2016-11-23 17:11:46,944 DEBUG [AM.ZK.Worker-pool2-t1] master.AssignmentManager: Handling
RS_ZK_REGION_FAILED_OPEN, server=example.org,30003,1479780976834, region=57513956a7b671f4e8da1598c2e2970e,
current_state={57513956a7b671f4e8da1598c2e2970e state=PENDING_OPEN, ts=1479892306843, server=example.org,30003,1479780976834}
> {noformat}
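
The failure described in the issue above comes down to a version-checked state transition being handed the wrong expected version. The following is a minimal Java sketch of that pattern. It is a toy model, not HBase's ZKAssign code: the class VersionedNodeSketch, its Node type, and the transition method are hypothetical names introduced for illustration, and the strict equality check on the version is an assumption chosen to match the RegionServer log message quoted above ("the node existed but was version 3 not the expected version -1").

```java
final class VersionedNodeSketch {

    /** Hypothetical stand-in for the unassigned-region znode: a state plus a data version. */
    static final class Node {
        String state;
        int version;

        Node(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    /**
     * Assumed strict version-checked transition (illustration only).
     * Returns the node's new version on success, or -1 if the state or version does not match.
     */
    static int transition(Node node, String from, String to, int expectedVersion) {
        if (!node.state.equals(from)) {
            return -1; // node is not in the state the caller expects
        }
        if (node.version != expectedVersion) {
            System.out.println("Attempt to transition from " + from + " to " + to
                + " failed, the node existed but was version " + node.version
                + " not the expected version " + expectedVersion);
            return -1; // caller's view of the node is stale or wrong
        }
        node.state = to;
        node.version++; // every successful write bumps the version
        return node.version;
    }

    public static void main(String[] args) {
        // The offline node already exists at version 3, as in the quoted log.
        Node node = new Node("M_ZK_REGION_OFFLINE", 3);

        // The caller passes -1 as the expected version: the check fails.
        int first = transition(node, "M_ZK_REGION_OFFLINE", "RS_ZK_REGION_OPENING", -1);
        System.out.println("with expected version -1: " + first);

        // A caller that knows the real version succeeds.
        int second = transition(node, "M_ZK_REGION_OFFLINE", "RS_ZK_REGION_OPENING", 3);
        System.out.println("with expected version 3: " + second);
    }
}
```

In this model the first call leaves the node untouched in M_ZK_REGION_OFFLINE, which mirrors the situation in the report: the node exists, but the opener was told to expect a version that does not match it.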



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
