Date: Wed, 11 Mar 2015 16:37:09 +0530
Subject: Trunk hangs after a stop/start of RegionServer
From: ramkrishna vasudevan
To: "dev@hbase.apache.org"

Hi All,

The latest trunk hangs after we stop and start the Region Server, with the following error:

org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via stobdtserver3,16040,1426090566331:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
    at org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:171)
    at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.abort(ZKProcedureMemberRpcs.java:329)
    at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.watchForAbortedProcedures(ZKProcedureMemberRpcs.java:142)
    at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.start(ZKProcedureMemberRpcs.java:352)
    at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager.start(RegionServerFlushTableProcedureManager.java:102)
    at org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost.start(RegionServerProcedureManagerHost.java:53)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:882)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
    at org.apache.hadoop.hbase.procedure.Subprocedure.cancel(Subprocedure.java:273)
    at org.apache.hadoop.hbase.procedure.ProcedureMember.controllerConnectionFailure(ProcedureMember.java:225)
    at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAcquired(ZKProcedureMemberRpcs.java:254)
    at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:166)
    at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

We get the same error even when we try to flush. Because of this the system hangs and we cannot perform any operations, particularly after restarting the region server.

I have a single-RS, single-master installation for internal testing.

Any hints on why this happens? It was not happening until the update I took 3 days ago.

Regards
Ram