Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: <CALte62zdPOPZLpE2fuHp3NXJkzyBP5RyC7E6susrX4xLhCCOOg@mail.gmail.com>
References: <CAPVWZXCmdrMpx3EtnVuQCGoF-jAfV0QYZ=HKiQVn7h2MjhE_Xw@mail.gmail.com>
 <CALte62wQo2hmRf=qxK6jqY2Hqt-pEGFhoeaBmnPTqd0+hB6o8g@mail.gmail.com>
 <CAPVWZXDO+Hpu1B3vwCtSKpODsu2DSPCpf4fa06fesZ+XOf-2xQ@mail.gmail.com> <CALte62zdPOPZLpE2fuHp3NXJkzyBP5RyC7E6susrX4xLhCCOOg@mail.gmail.com>
From: =?UTF-8?Q?Enis_S=C3=B6ztutar?= <enis@apache.org>
Date: Wed, 9 Nov 2016 11:47:35 -0800
Message-ID: <CAMUu0w-4kUto3xO_NPLyJCBP4xxjHypyA6Ux6M_cB-m3ZPrbtg@mail.gmail.com>
Subject: Re: Replication Issue, Attempting to flush snapshot with id = -1
To: "dev@hbase.apache.org" <dev@hbase.apache.org>
Content-Type: multipart/alternative; boundary=001a114e021a855f960540e38808
archived-at: Wed, 09 Nov 2016 19:47:58 -0000

--001a114e021a855f960540e38808
Content-Type: text/plain; charset=UTF-8

Indeed this looks like HBASE-16270.

Enis

On Wed, Nov 9, 2016 at 11:31 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Can you take a look at HBASE-16270 ?
>
> I did a brief search for 'UnexpectedStateException: Current snapshot id'
> which ended up with the above JIRA.
>
> See if it applies to your case.
>
> Cheers
>
> On Wed, Nov 9, 2016 at 10:42 AM, Timothy Brown <tim@siftscience.com>
> wrote:
>
> > Regarding the config I was referring to "*hbase.replication* (Default:
> > false) - Controls whether replication is enabled or disabled for the
> > cluster." (from https://hbase.apache.org/0.94/replication.html)
> >
> > Unfortunately the issue happened overnight and the exception gets thrown
> > multiple times per second. Here are more of the logs for reference, though:
> > http://pastebin.com/7KxZTrmf
> >
> > On Wed, Nov 9, 2016 at 10:31 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > bq. hbase.replication
> > >
> > > Not sure which config you were referring to above.
> > >
> > > Can you pastebin more of the region server log around the time the
> > > exception happened?
> > >
> > > Thanks
> > >
> > > On Wed, Nov 9, 2016 at 10:24 AM, Timothy Brown <tim@siftscience.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm currently trying to enable High Availability for my HBase cluster.
> > > > I'm using HBase version 1.2.0 provided by Cloudera's cdh5.8.0.
> > > > Everything works for a couple hours and then replication stops due to
> > > > the exception pasted below. We see sizeOfLogQueue continue to grow
> > > > every few minutes. Has anyone else run into this or know how we may
> > > > have gotten into this state?
> > > >
> > > >
> > > > Non-default configs set:
> > > >
> > > > hbase.region.replica.replication.enabled
> > > >
> > > > hbase.replication
> > > >
> > > >
> > > > Exception seen:
> > > >
> > > > Wed Nov 09 00:43:27 UTC 2016,
> > > > RpcRetryingCaller{globalStartTime=1478652206658, pause=100, retries=35},
> > > > org.apache.hadoop.hbase.regionserver.UnexpectedStateException:
> > > > org.apache.hadoop.hbase.regionserver.UnexpectedStateException: Current snapshot id is -1,passed 1478639480535
> > > >         at org.apache.hadoop.hbase.regionserver.DefaultMemStore.clearSnapshot(DefaultMemStore.java:191)
> > > >         at org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1082)
> > > >         at org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:119)
> > > >         at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.replayFlush(HStore.java:2377)
> > > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayFlushInStores(HRegion.java:4565)
> > > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:4471)
> > > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:4272)
> > > >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:835)
> > > >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1765)
> > > >         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22255)
> > > >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> > > >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> > > >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > > >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > > >         at java.lang.Thread.run(Thread.java:745)
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Tim
> > > >
> > >
> >
>

--001a114e021a855f960540e38808--