hbase-user mailing list archives

From Heng Chen <heng.chen.1...@gmail.com>
Subject Re: Region server getting aborted in every one or two days
Date Wed, 23 Mar 2016 10:01:37 GMT
Was your DN (DataNode) responding slowly at that time?

2016-03-23 15:50 GMT+08:00 Anoop John <anoop.hbase@gmail.com>:

> At the same time, did any explicit close operation happen on the WAL
> file? Any log rolling? Can you check the logs to find out? Maybe check
> the HDFS logs for close calls on the WAL file.
>
> -Anoop-
>
> On Wed, Mar 23, 2016 at 12:10 PM, Pankaj kr <pankaj.kr@huawei.com> wrote:
> > Hi,
> >
> > In our production environment, the RS is getting aborted every one or
> > two days with the following exception.
> >
> > 2016-03-16 13:57:07,975 | FATAL | MemStoreFlusher.0 | ABORTING region server xyz-vm8,24502,1458034278600: Replay of WAL required. Forcing server shutdown | org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2055)
> > org.apache.hadoop.hbase.DroppedSnapshotException: region: TB_WEBLOGIN_201603,060,1457916997964.06e204d3bc262b72820aa195fec23513.
> >                 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2423)
> >                 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2128)
> >                 at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2090)
> >                 at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1983)
> >                 at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1909)
> >                 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:509)
> >                 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:470)
> >                 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:74)
> >                 at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> >                 at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.nio.channels.ClosedChannelException
> >                 at org.apache.hadoop.hdfs.DataStreamer$LastExceptionInStreamer.throwException4Close(DataStreamer.java:208)
> >                 at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:142)
> >                 at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:635)
> >                 at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:490)
> >                 at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
> >                 at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:190)
> >                 at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1342)
> >                 ... 1 more
> >
> > I don't see any error info on the HDFS side at that point in time.
> > Has anyone faced this issue?
> >
> > HBase version is 0.98.6.
> >
> > Regards,
> > Pankaj
>
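
The log check suggested in this thread (look for close calls or log rolling on the WAL file around the time of the abort) could be sketched roughly as below. This is only an illustration, not an HBase or HDFS tool. It assumes each log line starts with a timestamp in the same "YYYY-MM-DD HH:MM:SS,mmm" form as the FATAL line quoted above. The function names, the 60-second window, and the keywords "close" and "roll" are all hypothetical choices; the actual wording of the relevant RegionServer, DataNode, or NameNode messages may differ and should be adjusted.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, taken from the FATAL line quoted in this thread.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # length of "2016-03-16 13:57:07,975"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        # Continuation lines (for example stack frames) carry no timestamp.
        return None


def lines_near(lines, abort_ts, window_seconds=60, keywords=("close", "roll")):
    """Collect timestamped lines within the window that mention a keyword.

    The keywords are placeholders (an assumption), not known log messages.
    """
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        lowered = line.lower()
        if abs(ts - abort_ts) <= window and any(k in lowered for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, pass the abort timestamp from the FATAL line (for example `parse_ts("2016-03-16 13:57:07,975")`) together with the lines of a RegionServer or HDFS log file, then read through the hits by hand. A plain substring match will produce false positives, so treat the output as a starting point for inspection only.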
