hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13317) Region server reportForDuty stuck looping if there is a master change
Date Tue, 31 Mar 2015 01:52:54 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387809#comment-14387809 ]

Hadoop QA commented on HBASE-13317:
-----------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12708263/HBASE-13317-branch-1-v5.patch
  against branch-1 branch at commit 55a5a3be33b3cea03112d55ad06c01440639ec54.
  ATTACHMENT ID: 12708263

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified
tests.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions
(2.4.1 2.5.2 2.6.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of
protoc compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the total number
of checkstyle errors

    {color:green}+1 findbugs{color}.  The patch does not introduce any  new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13497//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13497//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13497//artifact/patchprocess/checkstyle-aggregate.html

  Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13497//console

This message is automatically generated.

> Region server reportForDuty stuck looping if there is a master change
> ---------------------------------------------------------------------
>
>                 Key: HBASE-13317
>                 URL: https://issues.apache.org/jira/browse/HBASE-13317
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.0.0, 2.0.0, 0.98.12
>            Reporter: Jerry He
>            Assignee: Jerry He
>             Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13
>
>         Attachments: HBASE-13317-0.98-v2.patch, HBASE-13317-0.98-v3.patch, HBASE-13317-0.98-v4.patch,
HBASE-13317-0.98-v5.patch, HBASE-13317-0.98.patch, HBASE-13317-branch-1-v5.patch, HBASE-13317-master-v5.patch
>
>
> During cluster startup, region server reportForDuty gets stuck looping if there is a
master change.
> {noformat}
> 2015-03-22 11:15:16,186 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty
to master=bigaperf274,60000,1427045883965 with port=60020, startcode=1427048115174
> 2015-03-22 11:15:16,272 WARN  [regionserver60020] regionserver.HRegionServer: error telling
master we are up
> com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
> 	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
> 	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8277)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2137)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:896)
> 	at java.lang.Thread.run(Thread.java:745)
> 2015-03-22 11:15:16,274 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty
failed; sleeping and then retrying.
> 2015-03-22 11:15:19,274 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty
to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
> 2015-03-22 11:15:19,275 WARN  [regionserver60020] regionserver.HRegionServer: error telling
master we are up
> com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
> 	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
> 	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8277)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2137)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:896)
> 	at java.lang.Thread.run(Thread.java:745)
> 2015-03-22 11:15:19,276 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty
failed; sleeping and then retrying.
> 2015-03-22 11:15:22,276 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty
to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
> 2015-03-22 11:15:22,296 DEBUG [regionserver60020] regionserver.HRegionServer: Master
is not running yet
> 2015-03-22 11:15:22,296 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty
failed; sleeping and then retrying.
> 2015-03-22 11:15:25,296 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty
to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
> 2015-03-22 11:15:25,299 DEBUG [regionserver60020] regionserver.HRegionServer: Master
is not running yet
> 2015-03-22 11:15:25,299 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty
failed; sleeping and then retrying.
> 2015-03-22 11:15:28,299 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty
to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
> 2015-03-22 11:15:28,302 DEBUG [regionserver60020] regionserver.HRegionServer: Master
is not running yet
> 2015-03-22 11:15:28,302 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty
failed; sleeping and then retrying.
> {noformat}
> What happened is that the region server first got master=bigaperf274,60000,1427045883965.
 Before it was able to report successfully, the master changed to bigaperf273,60000,1427048108439.
> We were supposed to open a new connection to the new master. But we never did, and kept looping
and trying the old address forever.
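
The failure mode described above can be illustrated with a minimal sketch of a retry loop that re-reads the active master address on every attempt and re-targets its connection when the address changes. This is not the HBASE-13317 patch and does not use HBase's API: the class ReportForDutySketch, the Outcome holder, and the currentMaster and tryReport callbacks are all hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical sketch only; not the HBASE-13317 patch and not HBase's API. */
final class ReportForDutySketch {

    /** What the loop did: the master that accepted the report, and every address tried. */
    static final class Outcome {
        final String acceptedMaster;   // null if no attempt succeeded
        final List<String> attempted;  // addresses tried, in order

        Outcome(String acceptedMaster, List<String> attempted) {
            this.acceptedMaster = acceptedMaster;
            this.attempted = attempted;
        }
    }

    /**
     * Retries the report up to maxAttempts times.
     *
     * @param currentMaster assumed lookup of the active master (for example, from ZooKeeper)
     * @param tryReport     assumed RPC attempt; returns true if the master accepted the report
     */
    static Outcome reportForDuty(Supplier<String> currentMaster,
                                 Predicate<String> tryReport,
                                 int maxAttempts,
                                 long sleepMillis) {
        List<String> attempted = new ArrayList<>();
        String connectedTo = null; // address the cached connection currently targets

        for (int i = 0; i < maxAttempts; i++) {
            // Re-read the active master on every attempt instead of reusing
            // whatever address the first attempt happened to see.
            String master = currentMaster.get();
            if (!master.equals(connectedTo)) {
                // Master changed (or first attempt): drop the stale target
                // and point the connection at the new address.
                connectedTo = master;
            }

            attempted.add(connectedTo);
            if (tryReport.test(connectedTo)) {
                return new Outcome(connectedTo, attempted);
            }

            try {
                Thread.sleep(sleepMillis); // back off before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new Outcome(null, attempted);
    }
}
```

In the scenario from the log, the lookup would return bigaperf274 on the first attempt and bigaperf273 afterwards; a loop of this shape would switch to the new address on the second attempt rather than keep retrying against a stale connection.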



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
