hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Replication Issue, Attempting to flush snapshot with id = -1
Date Wed, 09 Nov 2016 19:31:04 GMT
Can you take a look at HBASE-16270?

I did a brief search for 'UnexpectedStateException: Current snapshot id'
which ended up with the above JIRA.

See if it applies to your case.

Cheers

On Wed, Nov 9, 2016 at 10:42 AM, Timothy Brown <tim@siftscience.com> wrote:

> Regarding the config I was referring to "*hbase.replication* (Default:
> false) - Controls whether replication is enabled or disabled for the
> cluster." (from https://hbase.apache.org/0.94/replication.html)
>
> Unfortunately the issue happened overnight, and the exception is thrown
> multiple times per second. Here's more of the logs for reference, though:
> http://pastebin.com/7KxZTrmf
>
> On Wed, Nov 9, 2016 at 10:31 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. hbase.replication
> >
> > Not sure which config you were referring to above.
> >
> > Can you pastebin more of the region server log around the time the
> > exception happened?
> >
> > Thanks
> >
> > On Wed, Nov 9, 2016 at 10:24 AM, Timothy Brown <tim@siftscience.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I'm currently trying to enable High Availability for my HBase cluster.
> > > I'm using HBase version 1.2.0 provided by Cloudera's cdh5.8.0.
> > > Everything works for a couple of hours and then replication stops due to
> > > the exception pasted below. We see sizeOfLogQueue continue to grow
> > > every few minutes. Has anyone else run into this, or does anyone know
> > > how we may have gotten into this state?
> > >
> > >
> > > Non-default configs set:
> > >
> > > hbase.region.replica.replication.enabled
> > >
> > > hbase.replication
> > >
> > >
> > > Exception seen:
> > >
> > > Wed Nov 09 00:43:27 UTC 2016, RpcRetryingCaller{globalStartTime=1478652206658, pause=100, retries=35}, org.apache.hadoop.hbase.regionserver.UnexpectedStateException: org.apache.hadoop.hbase.regionserver.UnexpectedStateException: Current snapshot id is -1,passed 1478639480535
> > >         at org.apache.hadoop.hbase.regionserver.DefaultMemStore.clearSnapshot(DefaultMemStore.java:191)
> > >         at org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1082)
> > >         at org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:119)
> > >         at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.replayFlush(HStore.java:2377)
> > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayFlushInStores(HRegion.java:4565)
> > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:4471)
> > >         at org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:4272)
> > >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:835)
> > >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1765)
> > >         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22255)
> > >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> > >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> > >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >         at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > > Thanks,
> > >
> > > Tim
> > >
> >
>
