hbase-user mailing list archives

From Shrijeet <shrij...@pinterest.com.INVALID>
Subject HLog's AsyncHLogWriter aborted abruptly
Date Tue, 22 Sep 2015 00:58:40 GMT
HBase Version: 0.94.26
HDFS version: 2.5.x

We backported HBASE-8755 onto 0.94.27 and are seeing a corner case that I
would like to run by the list. In one of our write-heavy use cases we
noticed a region server hanging forever (all server handlers busy in
HLog.sync, plus a few other odd things). From the logs we could see that we
had hit what looks like https://issues.apache.org/jira/browse/HDFS-7765

*15/09/04 19:55:32 INFO wal.HLog: AsyncHLogWriter exiting*
Exception in thread "AsyncHLogWriter" 15/09/04 19:55:32 INFO regionserver.StoreFile$Reader: Loaded ROW (CompoundBloomFilter) metadata for e10b96d0b1b94675b9bbe60b9ce8e220
15/09/04 19:55:32 INFO regionserver.Store: Added hdfs://localhost/hbase/table/a032227ad1500eec1d0ed108c52bc31c/t/e10b96d0b1b94675b9bbe60b9ce8e220, entries=609619, sequenceid=8449834686, filesize=4.5 M
*java.lang.ArrayIndexOutOfBoundsException: 4608*
*    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:76)*
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
    at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
    at org.apache.hadoop.io.SequenceFile$Writer.sync(SequenceFile.java:1229)
    at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1290)
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1330)
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1297)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:284)
    at org.apache.hadoop.hbase.regionserver.wal.HLog$AsyncWriter.run(HLog.java:1303)
    at java.lang.Thread.run(Thread.java:745)

Since AsyncWriter was aborted by an unhandled exception, the AsyncSyncer
etc. will not be notified, thus deadlocking the server. I am not familiar
with HLog's life cycle; how does it handle errors in its worker threads? I
see some IOE handling, but it looks like it's a deferred action. Is it fair
to assume we can't do the same (as for IOE) for all Throwables?
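
To make the failure mode concrete, here is a minimal sketch of a writer
thread that records any Throwable and wakes its waiters, so that they fail
fast instead of blocking forever. This is not HBase code: the class
WorkerFailureSketch and the methods runWriterOnce and waitForTxid are
hypothetical names, and the wait timeout is an extra safety net assumed for
illustration.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a writer/waiter handoff; not the HLog implementation. */
public class WorkerFailureSketch {
    private final Object lock = new Object();
    private long writtenTxid = 0; // guarded by lock
    private final AtomicReference<Throwable> fatal = new AtomicReference<>();

    /** Runs one batch of appends; on any Throwable, records it and wakes all waiters. */
    public void runWriterOnce(Runnable appendBatch, long txid) {
        try {
            appendBatch.run();
            synchronized (lock) {
                writtenTxid = txid;
                lock.notifyAll();
            }
        } catch (Throwable t) {
            // Catching Throwable (not only IOException) covers runtime
            // exceptions such as ArrayIndexOutOfBoundsException.
            fatal.compareAndSet(null, t);
            synchronized (lock) {
                lock.notifyAll();
            }
        }
    }

    /** Blocks until txid is written, the writer has failed, or the timeout elapses. */
    public void waitForTxid(long txid, long timeoutMs)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (lock) {
            while (writtenTxid < txid) {
                Throwable t = fatal.get();
                if (t != null) {
                    throw new IOException("writer thread died", t);
                }
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    throw new IOException("timed out waiting for txid " + txid);
                }
                lock.wait(remainingMs);
            }
        }
    }
}
```

In this sketch a waiter always re-checks the recorded failure after being
woken, so a dead writer turns into an IOException on the caller's side
rather than a hang.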

This is how we handle IOE in AsyncWriter:

// 3. write all buffered writes to HDFS(append, without sync)
try {
  for (Entry e : pendingWrites) {
    writer.append(e);
  }
} catch(IOException e) {
  LOG.fatal("Error while AsyncWriter write, request close of hlog ", e);
  requestLogRoll();
*  asyncIOE = e;*
  failedTxid.set(this.lastWrittenTxid);
}

// 4. update 'lastWrittenTxid' and notify AsyncSyncer to do 'sync'
asyncSyncer.setWrittenTxid(this.lastWrittenTxid);
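
The deferred pattern in the snippet above (store the exception now, surface
it on a later call) can be sketched in isolation as follows. This is an
illustration under assumptions, not the HLog code: DeferredErrorSketch,
record and checkDeferred are hypothetical names, and wrapping non-IOException
failures in an IOException is just one possible way to let a single field
carry every Throwable.

```java
import java.io.IOException;

/** Hypothetical illustration of deferred error reporting; not the HLog implementation. */
public class DeferredErrorSketch {
    // Plays the role of asyncIOE in the snippet above.
    private volatile IOException deferred;

    /** Called by the background thread: remember the failure instead of throwing. */
    public void record(Throwable t) {
        // Wrap anything that is not already an IOException so that one
        // field can carry both kinds of failure.
        deferred = (t instanceof IOException)
                ? (IOException) t
                : new IOException("unexpected writer failure", t);
    }

    /** Called later by a foreground operation: rethrow the stored failure, if any. */
    public void checkDeferred() throws IOException {
        IOException e = deferred;
        if (e != null) {
            throw e;
        }
    }
}
```

Whether wrapping every Throwable this way is appropriate (for example for
Errors such as OutOfMemoryError) is a separate design question that the
sketch does not settle.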
