Date: Mon, 25 Aug 2014 15:13:59 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11813) CellScanner#advance may infinitely recurse

    [ https://issues.apache.org/jira/browse/HBASE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109187#comment-14109187 ]

Hadoop QA commented on HBASE-11813:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12664156/11813v2.master.txt
  against trunk revision .
  ATTACHMENT ID: 12664156

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.

    {color:red}-1 Anti-pattern{color}.
The patch appears to have anti-pattern where BYTES_COMPARATOR was omitted:
 + NavigableMap> m = new TreeMap>();.

    {color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail.

    Compilation errors resume:
    [ERROR] COMPILATION ERROR :
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[79,10] error: TestCellUtil.TestCell is not abstract and does not override abstract method getTagsLength() in Cell
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[186,17] error: getTagsLength() in TestCellUtil.TestCell cannot implement getTagsLength() in Cell
    [ERROR] return type short is not compatible with int
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[191,4] error: method does not override or implement a method from a supertype
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-common: Compilation failure: Compilation failure:
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[79,10] error: TestCellUtil.TestCell is not abstract and does not override abstract method getTagsLength() in Cell
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[186,17] error: getTagsLength() in TestCellUtil.TestCell cannot implement getTagsLength() in Cell
    [ERROR] return type short is not compatible with int
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[185,4] error: method does not override or implement a method from a supertype
    [ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-common/src/test/java/org/apache/hadoop/hbase/TestCellUtil.java:[191,4] error: method does not override or implement a method from a supertype
    [ERROR] -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR] mvn -rf :hbase-common

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10565//console

This message is automatically generated.

> CellScanner#advance may infinitely recurse
> ------------------------------------------
>
> Key: HBASE-11813
> URL: https://issues.apache.org/jira/browse/HBASE-11813
> Project: HBase
> Issue Type: Bug
> Reporter: Andrew Purtell
> Assignee: stack
> Priority: Blocker
> Fix For: 0.99.0, 2.0.0, 0.98.6
>
> Attachments: 11813.098.txt, 11813.098.txt, 11813.master.txt, 11813.master.txt, 11813v2.master.txt, catch_all_exceptions.txt
>
>
> On user@hbase, johannes.schaback@visual-meta.com reported:
> {quote}
> we face a serious issue with our HBase production cluster for two days now. Every couple minutes, a random RegionServer gets stuck and does not process any requests. In addition this causes the other RegionServers to freeze within a minute which brings down the entire cluster. Stopping the affected RegionServer unblocks the cluster and everything comes back to normal.
> {quote}
> Subsequent troubleshooting reveals that RPC is getting stuck because we are losing RPC handlers.
> In the .out files we have this:
> {noformat}
> Exception in thread "defaultRpcServer.handler=5,queue=2,port=60020" java.lang.StackOverflowError
>         at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
>         at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
>         at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
>         at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> [...]
> Exception in thread "defaultRpcServer.handler=5,queue=2,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=18,queue=0,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=23,queue=2,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=24,queue=0,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=2,queue=2,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=11,queue=2,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=25,queue=1,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=20,queue=2,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=19,queue=1,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=15,queue=0,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=1,queue=1,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=7,queue=1,port=60020" java.lang.StackOverflowError
> Exception in thread "defaultRpcServer.handler=4,queue=1,port=60020" java.lang.StackOverflowError
> {noformat}
> That is the anonymous CellScanner instance we create from CellUtil#createCellScanner:
> {code}
>     return new CellScanner() {
>       private final Iterator iterator = cellScannerables.iterator();
>       private CellScanner cellScanner = null;
>       @Override
>       public Cell current() {
>         return this.cellScanner != null? this.cellScanner.current(): null;
>       }
>       @Override
>       public boolean advance() throws IOException {
>         if (this.cellScanner == null) {
>           if (!this.iterator.hasNext()) return false;
>           this.cellScanner = this.iterator.next().cellScanner();
>         }
>         if (this.cellScanner.advance()) return true;
>         this.cellScanner = null;
> --->    return advance();
>       }
>     };
> {code}
> That final return statement is the immediate problem.
> We should also fix this so the RegionServer aborts if it loses a handler to an Error.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
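
For illustration only (this sketch is not taken from any of the attached patches): the quoted advance() calls itself once for each scanner that turns out to be empty or exhausted, so a long run of such scanners can use up the handler thread's stack. One way to avoid that is to replace the tail call with a loop, which keeps the stack depth constant. The Scanner and Scannable interfaces, the of() and chain() helpers, and the use of String in place of Cell are hypothetical stand-ins for the HBase types so that the example compiles on its own; the checked IOException declared by the real advance() is left out for brevity.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch only: simplified, hypothetical stand-ins for the HBase types. */
final class ChainedScannerSketch {

  /** Stand-in for CellScanner; String plays the role of Cell. */
  interface Scanner {
    String current();
    boolean advance();
  }

  /** Stand-in for CellScannable. */
  interface Scannable {
    Scanner scanner();
  }

  /** Hypothetical helper: exposes a fixed set of cells as a Scannable. */
  static Scannable of(String... cells) {
    final List<String> list = Arrays.asList(cells);
    return new Scannable() {
      @Override
      public Scanner scanner() {
        return new Scanner() {
          private int index = -1; // positioned before the first cell

          @Override
          public String current() {
            return index >= 0 && index < list.size() ? list.get(index) : null;
          }

          @Override
          public boolean advance() {
            if (index < list.size()) index++;
            return index < list.size();
          }
        };
      }
    };
  }

  /** Chains several Scannables into one Scanner without recursing. */
  static Scanner chain(final List<Scannable> scannables) {
    return new Scanner() {
      private final Iterator<Scannable> iterator = scannables.iterator();
      private Scanner scanner = null;

      @Override
      public String current() {
        return scanner != null ? scanner.current() : null;
      }

      @Override
      public boolean advance() {
        // Loop instead of "return advance()": every empty or exhausted
        // scanner is skipped in the same stack frame.
        while (true) {
          if (scanner == null) {
            if (!iterator.hasNext()) return false;
            scanner = iterator.next().scanner();
          }
          if (scanner.advance()) return true;
          scanner = null;
        }
      }
    };
  }
}
```

Under these assumptions, chaining a long run of empty Scannables (created with of() and no arguments) followed by of("a", "b") should yield "a" and then "b" from successive advance()/current() calls, regardless of how many empty entries come first.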