From: Ted Yu
Date: Wed, 9 Nov 2016 11:31:04 -0800
Subject: Re: Replication Issue, Attempting to flush snapshot with id = -1
To: "dev@hbase.apache.org"
Archived-At: Wed, 09 Nov 2016 19:33:12 -0000

Can you take a look at HBASE-16270?

I did a brief search for 'UnexpectedStateException: Current snapshot id',
which led to the above JIRA.

See if it applies to your case.

Cheers

On Wed, Nov 9, 2016 at 10:42 AM, Timothy Brown wrote:

> Regarding the config, I was referring to "*hbase.replication* (Default:
> false) - Controls whether replication is enabled or disabled for the
> cluster." (from https://hbase.apache.org/0.94/replication.html)
>
> Unfortunately the issue happened overnight, and the exception gets thrown
> multiple times per second. Here's more of the logs for reference, though:
> http://pastebin.com/7KxZTrmf
>
> On Wed, Nov 9, 2016 at 10:31 AM, Ted Yu wrote:
>
> > bq. hbase.replication
> >
> > Not sure which config you were referring to above.
> >
> > Can you pastebin more of the region server log around the time the
> > exception happened?
> >
> > Thanks
> >
> > On Wed, Nov 9, 2016 at 10:24 AM, Timothy Brown wrote:
> >
> > > Hi,
> > >
> > > I'm currently trying to enable High Availability for my HBase cluster.
> > > I'm using HBase version 1.2.0 provided by Cloudera's cdh5.8.0.
> > > Everything works for a couple of hours, and then replication stops due
> > > to the exception pasted below. We see sizeOfLogQueue continue to grow
> > > every few minutes. Has anyone else run into this, or does anyone know
> > > how we may have gotten into this state?
> > >
> > >
> > > Non Default Configs set:
> > >
> > > hbase.region.replica.replication.enabled
> > >
> > > hbase.replication
> > >
> > >
> > > Exception seen:
> > >
> > > Wed Nov 09 00:43:27 UTC 2016,
> > > RpcRetryingCaller{globalStartTime=1478652206658, pause=100, retries=35},
> > > org.apache.hadoop.hbase.regionserver.UnexpectedStateException:
> > > org.apache.hadoop.hbase.regionserver.UnexpectedStateException: Current snapshot id is -1,passed 1478639480535
> > >     at org.apache.hadoop.hbase.regionserver.DefaultMemStore.clearSnapshot(DefaultMemStore.java:191)
> > >     at org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1082)
> > >     at org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:119)
> > >     at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.replayFlush(HStore.java:2377)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion.replayFlushInStores(HRegion.java:4565)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:4471)
> > >     at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:4272)
> > >     at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:835)
> > >     at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1765)
> > >     at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22255)
> > >     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> > >     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> > >     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >     at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > > Thanks,
> > >
> > > Tim
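
Since the exception in this thread repeats many times per second, it may help to
summarize the region server log before comparing it against HBASE-16270. The
following is only a rough sketch, not part of HBase: the function names
(snapshot_mismatches, count_by_passed_id) are hypothetical, and the pattern
assumes the message text looks exactly like the one quoted in this thread
("Current snapshot id is -1,passed 1478639480535").

```python
import re
from collections import Counter

# Assumption: the message format matches the line quoted in this thread.
SNAPSHOT_RE = re.compile(r"Current snapshot id is (-?\d+),\s*passed (-?\d+)")


def snapshot_mismatches(lines):
    """Return (current_id, passed_id) pairs found in the given log lines."""
    pairs = []
    for line in lines:
        match = SNAPSHOT_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


def count_by_passed_id(lines):
    """Count how often each passed snapshot id appears in a mismatch."""
    return Counter(passed for _, passed in snapshot_mismatches(lines))


# Two lines copied from the stack trace in this thread.
sample = [
    "org.apache.hadoop.hbase.regionserver.UnexpectedStateException: "
    "Current snapshot id is -1,passed 1478639480535",
    "at org.apache.hadoop.hbase.regionserver.DefaultMemStore."
    "clearSnapshot(DefaultMemStore.java:191)",
]

for current_id, passed_id in snapshot_mismatches(sample):
    print(f"current={current_id} passed={passed_id}")
```

To run it against a real log, pass an open file object instead of `sample`. If
a single passed id dominates the counts, the same flush marker is probably being
replayed over and over, which would be worth mentioning on the JIRA.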