hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1042) OOME but we don't abort
Date Wed, 03 Dec 2008 00:02:44 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1042:
-------------------------

    Attachment: 1042.patch

Patch that catches Throwable in Leases and exits, cleaning up existing leases.  Also added
to checkOOME a test for the special OOME that comes up out of a MapFile (the OOME is in the
IOE message, not the cause of the IOE; see the example below), so that we abort rather
than, as now, miss the OOME.

{code}
2008-12-02 12:58:51,274 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60020,
call openScanner([B@630ff939, [[B@2db33ffe, [B@532e5422, 9223372036854775807, null) from XX.XX.XX.106:55041:
error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
java.io.IOException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.MapFile$Reader.readIndex(MapFile.java:337)
        at org.apache.hadoop.io.MapFile$Reader.midKey(MapFile.java:368)
        at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:93)
        at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:66)
        at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:67)
        at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:84)
        at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2095)
        at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:1977)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1159)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1441)
        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:634)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
{code}
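
A minimal sketch of the kind of check described above, assuming a helper that inspects both the cause chain and the exception message. The class name OomeCheck and the method isOOME are hypothetical; this is not the code in 1042.patch, only an illustration of why the message has to be examined when the OOME is not set as the cause of the IOE.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not the actual checkOOME from the patch. */
final class OomeCheck {
    private OomeCheck() {}

    /**
     * Returns true if t is an OutOfMemoryError, has one in its cause chain,
     * or is an IOException whose message names one (the MapFile case above,
     * where the OOME appears only in the IOE message).
     */
    static boolean isOOME(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof OutOfMemoryError) {
                return true;
            }
            if (cur instanceof IOException) {
                String msg = cur.getMessage();
                // Assumption: matching on the class name in the message is enough
                // to recognize the wrapped OOME shown in the stack trace.
                if (msg != null && msg.contains("java.lang.OutOfMemoryError")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A caller would abort the server when isOOME returns true instead of starting a graceful shutdown. Matching on the message text is brittle, so treat it as a fallback for exceptions that arrive without a usable cause.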

> OOME but we don't abort
> -----------------------
>
>                 Key: HBASE-1042
>                 URL: https://issues.apache.org/jira/browse/HBASE-1042
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.19.0
>
>         Attachments: 1042.patch
>
>
> On the streamy cluster, saw a case where a graceful shutdown had been triggered rather than an
> abort on OOME.  On graceful shutdown, we wait on leases to expire or be closed.  The server wouldn't
> go down because it was waiting on leases to expire, but an OOME in Leases had killed the thread,
> so it was never going to expire anything.  The node was stuck for four hours until someone noticed
> it.
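
The failure mode in the description can be sketched as follows, assuming a lease-monitoring loop that records any Throwable and clears its leases on the way out, so that a shutdown waiting on leases is not left hanging. LeaseMonitorSketch, its fields, and the injected expiryPass hook are all hypothetical names for illustration; the real Leases class is not reproduced here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a lease-expiry thread; not the HBase Leases class. */
final class LeaseMonitorSketch implements Runnable {
    /** Lease name to expiry time in milliseconds (assumed structure). */
    final Map<String, Long> leases = new ConcurrentHashMap<>();

    /** Set if the loop died on an unexpected Throwable, so a caller can abort. */
    volatile Throwable fatal;

    private volatile boolean stopped;

    /** One pass of expiry work; in a real server this would block or sleep. */
    private final Runnable expiryPass;

    LeaseMonitorSketch(Runnable expiryPass) {
        this.expiryPass = expiryPass;
    }

    void stop() {
        stopped = true;
    }

    @Override
    public void run() {
        try {
            while (!stopped) {
                expiryPass.run(); // may throw anything, including OutOfMemoryError
            }
        } catch (Throwable t) {
            // Catch Throwable, not just Exception, so an Error cannot
            // silently kill the thread and leave leases unexpired.
            fatal = t;
        } finally {
            // Release everything on exit so nothing waits on these leases forever.
            leases.clear();
        }
    }
}
```

Cleanup after a real OutOfMemoryError is best effort, since the cleanup itself may fail to allocate. That is one reason to pair a loop like this with an abort rather than rely on a graceful shutdown.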

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

