hbase-issues mailing list archives

From "Sebastian Cotarla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14392) [tests] TestLogRollingNoCluster fails on master from time to time
Date Wed, 07 Oct 2015 14:10:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946897#comment-14946897 ]

Sebastian Cotarla commented on HBASE-14392:

I would like to mention that we have hit this issue at the organization I work for. We run a
stand-alone HBase instance (v1.0.1.1) that stores OpenTSDB data with a TTL of 5 minutes (used
for monitoring purposes). The NPE forced an abort of our Region Server:

2015-10-05 23:38:04,104 FATAL [regionserver/abcd/] regionserver.HRegionServer:
ABORTING region server abcd,39501,1443424863100: IOE in log roller
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:176)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1305)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException

Cheers for the fix; I'm keen to test it in the 1.2.0 release.

> [tests] TestLogRollingNoCluster fails on master from time to time
> -----------------------------------------------------------------
>                 Key: HBASE-14392
>                 URL: https://issues.apache.org/jira/browse/HBASE-14392
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>         Attachments: 14392.txt, 14392.txt, 14392v2.txt, 14392v2.txt, 14392v2.txt
> TestLogRollingNoCluster fails from time to time on a rig I have running here.
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
> Running org.apache.hadoop.hbase.regionserver.wal.TestLogRollingNoCluster
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.155 sec - in org.apache.hadoop.hbase.regionserver.wal.TestLogRollingNoCluster
> Results :
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
> The test itself is a bit odd. We apparently have seen NPEs around close of a WAL while
> trying to sync... which I suppose makes sense if two threads are involved.
> Attached patch just fails the sync silently if the stream is null... let's presume it is a close.
> Adds a sync on write of the trailer too...
> This patch seems to have gotten rid of the odd failure seen on a particular box here
> if I keep cycling the test.
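
The approach described in the quoted issue (skip the sync quietly when the stream is already null, and sync after writing the trailer) can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not the real ProtobufLogWriter: the class GuardedWriter, its TRAILER marker, and the boolean return value of sync() are all hypothetical names invented for this example, and a plain java.io.OutputStream stands in for the HDFS stream.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for a WAL writer; NOT the HBase ProtobufLogWriter. */
class GuardedWriter {
    // Hypothetical trailer marker, used only for this illustration.
    private static final byte[] TRAILER = {'E', 'N', 'D'};

    // volatile so a syncing thread observes the null stored by a closing thread.
    private volatile OutputStream output;

    GuardedWriter(OutputStream output) {
        this.output = output;
    }

    /** Returns true if a flush happened, false if the writer was already closed. */
    boolean sync() throws IOException {
        OutputStream out = this.output; // read the field once into a local
        if (out == null) {
            return false; // presume a concurrent close; nothing left to sync
        }
        out.flush();
        return true;
    }

    /** Writes the trailer, flushes it, then releases the stream. Safe to call twice. */
    void close() throws IOException {
        OutputStream out = this.output;
        if (out == null) {
            return; // already closed
        }
        out.write(TRAILER);
        out.flush();        // sync right after the trailer is written
        this.output = null; // later sync() calls now return false instead of throwing an NPE
        out.close();
    }
}
```

Copying the field into a local before the null check is what removes the NullPointerException: the check and the flush then operate on the same reference. The sketch does not make close and sync mutually exclusive, so a flush can still land on a stream that another thread is closing and surface as an IOException; whether that matters depends on the underlying stream.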

This message was sent by Atlassian JIRA
