hbase-issues mailing list archives

From "Dave Latham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4107) OOME while writing WAL checksum causes corrupt WAL
Date Fri, 26 Aug 2011 17:41:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091896#comment-13091896 ]

Dave Latham commented on HBASE-4107:
------------------------------------

It looks like HLog's main method supports a --split option.  If I invoke that on the
log, will it split the log and put the data into the right place?

We had a handful of regionservers go OOM yesterday while an MR job was doing heavy writes to
a column family that doesn't usually get them.  In this case, the first OOM occurred here,
while writing the checksum.

> OOME while writing WAL checksum causes corrupt WAL
> --------------------------------------------------
>
>                 Key: HBASE-4107
>                 URL: https://issues.apache.org/jira/browse/HBASE-4107
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.90.1
>         Environment: CentOS 5.5x64
>            Reporter: Andy Sautins
>         Attachments: master.splitting.log, regionserver.oom.log
>
>
> An issue was observed where, upon shutdown of a regionserver, the regionserver log was
> corrupt.  It appears from the following stacktrace that a Java heap memory exception occurred
> while writing the checksum to the WAL.  Corrupting the WAL can potentially cause data loss.

> 2011-07-14 14:54:53,741 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
> java.io.IOException: Reflection
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:987)
>         at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:964)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.GeneratedMethodAccessor1336.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
>         ... 2 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2375)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3271)
>         at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3354)
>         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>         at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
>         ... 6 more
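
The nesting in the trace above (IOException: Reflection, caused by InvocationTargetException, caused by OutOfMemoryError) is what one would expect when a method is invoked through java.lang.reflect and the caller wraps any failure in an IOException. The following is only a minimal sketch of that pattern, not the actual HBase or HDFS code: the class ReflectionWrapDemo, the FakeStream type, and the simulated OutOfMemoryError are all hypothetical stand-ins.

```java
import java.io.IOException;
import java.lang.reflect.Method;

public class ReflectionWrapDemo {
    // Hypothetical stand-in for an output stream whose sync fails with an Error.
    public static class FakeStream {
        public void syncFs() {
            // Simulated only; a real OOME would be raised by the JVM on allocation.
            throw new OutOfMemoryError("Java heap space (simulated)");
        }
    }

    // Looks up syncFs reflectively, invokes it, and wraps any failure
    // in an IOException with the message "Reflection".
    public static void sync(Object stream) throws IOException {
        try {
            Method m = stream.getClass().getMethod("syncFs");
            // Method.invoke wraps whatever the target throws in an
            // InvocationTargetException, which is an Exception.
            m.invoke(stream);
        } catch (Exception e) {
            throw new IOException("Reflection", e);
        }
    }

    // Walks the cause chain down to the innermost throwable.
    public static Throwable rootCause(Throwable t) {
        while (t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        try {
            sync(new FakeStream());
        } catch (IOException e) {
            System.out.println(e.getMessage());
            System.out.println(e.getCause().getClass().getName());
            System.out.println(rootCause(e).getClass().getName());
        }
    }
}
```

In this sketch the Error reaches the caller as an ordinary IOException, so code that only looks at the outermost exception cannot tell a heap exhaustion from a routine I/O failure without walking the cause chain.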

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
