Date: Thu, 23 Jul 2015 15:04:09 +0800
Subject: Re: Dead loop for batch put when get WrongRegionException
From: Louis Hust
To: Victor Xu
Cc: "user@hbase.apache.org"

My HBase version is 0.98.6. I will try updating the client to 0.98.14 while
keeping the server at 0.98.6. Thanks very much!

2015-07-23 15:00 GMT+08:00 Victor Xu :

> A client-side dead loop can cause wrong read/write requests to be sent to
> the region servers, and you've got exactly the same log output as I did
> when the bug happened. However, this only occurs with 0.98.x versions.
> 1.0 and above do not have this problem.
>
> On Thu, Jul 23, 2015 at 2:55 PM Louis Hust wrote:
>
>> It seems that HBASE-13896 is a client-side dead loop, but my problem is
>> a regionserver-side dead lock when getting the row lock.
>>
>> 2015-07-23 11:23 GMT+08:00 Victor Xu :
>>
>>> FYI
>>>
>>> ---------- Forwarded message ---------
>>> From: Victor Xu
>>> Date: Thu, Jul 23, 2015 at 11:22 AM
>>> Subject: Re: Dead loop for batch put when get WrongRegionException
>>> To: user@hbase.apache.org
>>>
>>> Any chance that this would be your problem?
>>> https://issues.apache.org/jira/browse/HBASE-13896
>>>
>>> On Thu, Jul 23, 2015 at 11:17 AM Louis Hust wrote:
>>>
>>>> Hi all,
>>>>
>>>> We are using batch put to insert rows, and sometimes get the following
>>>> WARN in the region server log:
>>>>
>>>> 2015-07-23 10:08:49,684 WARN [B.defaultRpcServer.handler=5,queue=5,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=BHXYHZFIHHR3ECON101002150723999999
>>>> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion atpco:ttf_fare,C,1437145538123.9c2b8cb846b318045f2ad6b5c87fef21., startKey='C', getEndKey()='D', row='BHXYHZFIHHR3ECON101002150723999999'
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3456)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3474)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2217)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4386)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3588)
>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3477)
>>>>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29593)
>>>>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>>>>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>
>>>> The WARN message is logged non-stop. I think the batch put has fallen
>>>> into a dead loop.
>>>>
>>>> I looked into the source code and found that the batch put will never
>>>> stop if it gets a WrongRegionException for some row.
>>>>
>>>> Does anybody know how to avoid this situation?
>>>>
>>>> Any ideas will be appreciated!