hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12028) Abort the RegionServer, when one of it's handler threads die
Date Tue, 30 Dec 2014 01:11:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260636#comment-14260636 ]

Hadoop QA commented on HBASE-12028:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12689395/hbase-12028-v5.patch
  against master branch at commit b2eea8cac6cceab323bf79b77e321d24dd5c90c2.
  ATTACHMENT ID: 12689395

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 5 new or modified
tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:red}-1 checkstyle{color}.  The applied patch generated 2086 checkstyle
errors (more than the master's current 2081 errors).

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

    {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//artifact/patchprocess/checkstyle-aggregate.html

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12251//console

This message is automatically generated.

> Abort the RegionServer, when one of it's handler threads die
> ------------------------------------------------------------
>
>                 Key: HBASE-12028
>                 URL: https://issues.apache.org/jira/browse/HBASE-12028
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Sudarshan Kadambi
>            Assignee: Alicia Ying Shu
>         Attachments: Hbase-12028-v3.patch, Hbase-12028.patch, hbase-12028-v4.patch, hbase-12028-v5.patch
>
>
> Over in HBASE-11813, a user identified an issue wherein all the RPC handler threads
would exit with StackOverflow errors due to an unchecked recursion-terminating condition.
Our clusters demonstrated the same trace. While the patch posted for HBASE-11813 got our clusters
to be merry again, the breakdown surfaced some larger issues.
> When the RegionServer had all its RPC handler threads dead, it continued to have regions
assigned to it. Clearly, it wouldn't be able to serve reads and writes on those regions. A second
issue was that when a user tried to disable or drop a table, the master would try to communicate
with the regionserver for region unassignment. Since the same handler threads seem to be used
for master <-> RS communication as well, the master ended up hanging on the RS indefinitely.
Eventually, the master stopped responding to all table meta-operations.
> A handler thread should never exit, and if it does, it seems like the more prudent thing
to do would be for the RS to abort. This way, at least recovery can be undertaken and the
regions could be reassigned elsewhere. I also think that the master <-> RS communication
should get its own exclusive threadpool, but I'll wait until this issue has been sufficiently
discussed before opening an issue ticket for that.
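
For illustration only, the behaviour proposed in the description (abort when a handler thread dies) could be sketched roughly as below. This is not the attached patch and does not use HBase's actual classes: `HandlerAbortSketch`, the nested `Abortable` interface, and `startHandler` are hypothetical names, and the sketch assumes a plain `Thread.UncaughtExceptionHandler` is an acceptable hook for detecting a dead handler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; not the HBASE-12028 patch and not HBase's real API. */
public class HandlerAbortSketch {

  /** Stand-in for whatever is able to abort the server (assumed interface). */
  interface Abortable {
    void abort(String why, Throwable cause);
  }

  /**
   * Starts a handler thread. If the task exits with any uncaught Throwable
   * (including Errors such as StackOverflowError), the server is asked to abort.
   */
  static Thread startHandler(String name, Runnable task, Abortable server) {
    Thread t = new Thread(task, name);
    t.setUncaughtExceptionHandler((thread, e) ->
        server.abort("Handler thread " + thread.getName() + " died", e));
    t.start();
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch aborted = new CountDownLatch(1);
    AtomicReference<String> reason = new AtomicReference<>();

    // A fake server that only records why it was asked to abort.
    Abortable server = (why, cause) -> {
      reason.set(why + ": " + cause);
      aborted.countDown();
    };

    // Simulate a handler dying the way the description reports.
    startHandler("rpc-handler-0",
        () -> { throw new StackOverflowError("simulated"); }, server);

    aborted.await(5, TimeUnit.SECONDS);
    System.out.println(reason.get());
  }
}
```

In a real server the `abort` callback would start shutdown so that regions can be reassigned elsewhere; here it only records the reason so the sketch stays self-contained.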



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)