hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-1040) OOME does not cause graceful shutdown under some failure scenarios
Date Tue, 02 Dec 2008 02:01:47 GMT
OOME does not cause graceful shutdown under some failure scenarios

                 Key: HBASE-1040
                 URL: https://issues.apache.org/jira/browse/HBASE-1040
             Project: Hadoop HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.18.1
            Reporter: Andrew Purtell

The OOME-related updates on trunk should probably be backported to the 0.18 branch. I am seeing
these exceptions on our cluster:

> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> at java.io.DataInputStream.readFully(DataInputStream.java:175)
> at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
> at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1933)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1833)
> at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
> at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:516)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getNext(StoreFileScanner.java:312)

When OOMEs like the one above occur, the cluster does not recover without manual intervention.
Sometimes the regionservers go down afterward; at other times they stay up in an unhealthy state
for a while. Regions go offline and remain unavailable.
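
For illustration only: in the trace above the OutOfMemoryError surfaces as the message of an
IOException, so a handler that only catches OutOfMemoryError directly would likely miss it.
Below is a minimal sketch of a check that walks the cause chain and also inspects the message
text. The class and method names (OomeCheck, isOutOfMemory) are hypothetical and are not taken
from the HBase code base, and this is not necessarily how trunk handles the problem.

```java
import java.io.IOException;

// Hypothetical helper, not part of HBase: detects an OOME even when wrapped.
public class OomeCheck {

    /** Returns true if t, or anything in its cause chain, is or reports an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
            // Covers the case in the trace, where only the message text survives.
            String msg = t.getMessage();
            if (msg != null && msg.contains("java.lang.OutOfMemoryError")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same shape as the exception in the report: an IOException carrying the OOME text.
        IOException wrapped = new IOException("java.lang.OutOfMemoryError: Java heap space");
        if (isOutOfMemory(wrapped)) {
            // Assumption: a real regionserver would begin a graceful shutdown at this point.
            System.out.println("OOME detected, shutdown would be requested here");
        }
    }
}
```

A caller would run this check wherever IOExceptions from the read path are caught, and request
shutdown rather than continuing to serve regions when it returns true.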

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
