hbase-user mailing list archives

From Gaurav Agarwal <gau...@arkin.net>
Subject Re: wal.FSHLog: Error syncing, request close of wal (regionserver crashes)
Date Fri, 30 Oct 2015 05:38:49 GMT
Hi, just a bump on this post to check if anyone knows more about this...

On Mon, Oct 26, 2015 at 11:06 PM, Gaurav Agarwal <gaurav@arkin.net> wrote:

> Hi All,
>
> We are running HBase (*Version 1.0.0-cdh5.4.2, rUnknown, Tue May 19
> 17:04:41 PDT 2015*) and are hitting the problem described in
> https://issues.apache.org/jira/browse/HBASE-12074, where the
> regionserver crashes due to a concurrent roll of the WAL file.
>
> Below are the failure logs from one of the instances in our environment:
>
> 2015-10-25 22:09:41,885 INFO  [regionserver/localhost/127.0.0.1:60020.logRoller]
> wal.FSHLog: Rolled WAL
> /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810949648
> with entries=11826, filesize=30.40 MB; new WAL
> /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810981882
> 2015-10-25 22:10:09,177 INFO  [regionserver/localhost/127.0.0.1:60020.logRoller]
> wal.FSHLog: Rolled WAL
> /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445810981882
> with entries=7796, filesize=30.41 MB; new WAL
> /var/lib/hbase/data/WALs/localhost,60020,1445796437179/localhost%2C60020%2C1445796437179.default.1445811009174
> 2015-10-25 22:10:09,189 ERROR [sync.2] wal.FSHLog: Error syncing, request
> close of wal
> java.io.IOException: java.lang.NullPointerException
> at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:176)
> at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1334)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
> ... 2 more
> 2015-10-25 22:10:09,226 FATAL [regionserver/localhost/127.0.0.1:60020.logRoller]
> regionserver.HRegionServer: ABORTING region server
> localhost,60020,1445796437179: IOE in log roller
> java.io.IOException: java.lang.NullPointerException
> at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:176)
> at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1334)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
> ... 2 more
> 2015-10-25 22:10:09,226 FATAL [regionserver/localhost/127.0.0.1:60020.logRoller]
> regionserver.HRegionServer: RegionServer abort: loaded coprocessors are:
> [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
>
> Does anyone know if there is a workaround for this problem, or a patch
> for it?
> If there is no workaround or patch, I can help create a patch, but I would
> need some general guidance on what could be going on here.
>
> --cheers, gaurav
>



-- 
--cheers, gaurav
