Date: Sat, 10 Dec 2016 03:42:59 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17276) Reduce log spam from WrongRegionException in large multi()'s

    [ https://issues.apache.org/jira/browse/HBASE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15737132#comment-15737132 ]

Hudson commented on HBASE-17276:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #2104 (See [https://builds.apache.org/job/HBase-Trunk_matrix/2104/])
HBASE-17276 Only log stacktraces for exceptions once for updates in a (stack: rev b554e054109039bdf92b103243b5f862a0e49cfd)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestObservedExceptionsInBatch.java

> Reduce log spam from WrongRegionException in large multi()'s
> ------------------------------------------------------------
>
> Key: HBASE-17276
> URL: https://issues.apache.org/jira/browse/HBASE-17276
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-17276.001.patch, HBASE-17276.002.patch
>
>
> The following spam drives me up a wall in the regionserver log:
> {noformat}
> 2016-12-05 05:53:05,085 WARN [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x0C\xD2\xA5\xA3\x99\xC7\xE0Q!\x15^\xA6\x90\x1E\xA3\xAD'
> at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2016-12-05 05:53:05,086 WARN [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x0E\xE7\xFA[\x8D\x93;\xF4\xC7F\xF9\x85\x84\x85\xF3\x0E'
> at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> 2016-12-05 05:53:05,087 WARN [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16020] regionserver.HRegion: Batch mutation had a row that does not belong to this region
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for doMiniBatchMutation on HRegion IntegrationTestReplicationSinkRestart,L\xCC\xCC\xCC\xCC\xCC\xCC\xC8,1480916713541.caab3310166699287b54b72b35b29431., startKey='L\xCC\xCC\xCC\xCC\xCC\xCC\xC8', getEndKey()='Y\x99\x99\x99\x99\x99\x99\x94', row='\x16-\xFC\x99\xF5c\x08\xFA\x1D\x84\x86\xD2\x18\xB1\x03q'
> at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:5211)
> at org.apache.hadoop.hbase.regionserver.HRegion.checkAndPrepareMutation(HRegion.java:3879)
> at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3040)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2933)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2875)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:717)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:679)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2056)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32303)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2141)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> With adequate replication traffic that is delayed or just slow, you can have
> a batch of 64MB of updates for a Region whose rows are all hosted on a different RegionServer by the time the RS processes the batch.
> In a run of IntegrationTestReplication that was particularly "slow"/oversaturated, this message accounted for 1.591M of the log's 1.597M lines (99.6% of the log). I propose that after the first WrongRegionException we see in {{doMiniBatchMutation}}, we stop printing the stacktrace for the remaining occurrences (saving 13 lines for every occurrence).
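
The proposal above could be sketched roughly as follows. This is only an illustration of the "full stacktrace once per batch" idea, not the committed HBASE-17276 patch: the class `OncePerBatchLogging`, the `BatchLogState` helper, the `logBatch` method, and the stand-in `WrongRegionException` are all hypothetical names, and log output is collected in a list instead of going through a real logger so the effect is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not the HBASE-17276 patch): full stacktrace once per batch. */
class OncePerBatchLogging {

  /** Stand-in for HBase's WrongRegionException so the example is self-contained. */
  static class WrongRegionException extends Exception {
    WrongRegionException(String msg) {
      super(msg);
    }
  }

  /** Per-batch state: remembers whether the stacktrace has already been printed. */
  static class BatchLogState {
    private boolean sawWrongRegion = false;
    private final List<String> lines = new ArrayList<>();

    void warn(String msg, WrongRegionException e) {
      if (!sawWrongRegion) {
        // First occurrence in this batch: message, exception, and every frame.
        lines.add("WARN " + msg);
        lines.add(e.toString());
        for (StackTraceElement frame : e.getStackTrace()) {
          lines.add("\tat " + frame);
        }
        sawWrongRegion = true;
      } else {
        // Later occurrences: a single line carrying only the exception message.
        lines.add("WARN " + msg + ": " + e.getMessage());
      }
    }

    List<String> lines() {
      return lines;
    }
  }

  /** Simulates one batch in which every row is rejected; returns the log lines produced. */
  static List<String> logBatch(int rows) {
    BatchLogState state = new BatchLogState();
    for (int i = 0; i < rows; i++) {
      state.warn("Batch mutation had a row that does not belong to this region",
          new WrongRegionException("Requested row out of range, row=" + i));
    }
    return state.lines();
  }

  public static void main(String[] args) {
    List<String> lines = logBatch(1000);
    System.out.println(lines.size() + " log lines for 1000 rejected rows");
  }
}
```

Keeping the flag on a per-batch object (rather than in a static field) means each new batch still gets one full stacktrace, so the first failure of every batch remains debuggable while the repeats cost one line each.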