hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-20322) CME in StoreScanner causes region server crash
Date Tue, 03 Apr 2018 19:02:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-20322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424449#comment-16424449 ]

Hudson commented on HBASE-20322:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-1.3-IT #387 (See [https://builds.apache.org/job/HBase-1.3-IT/387/])
HBASE-20322 CME in StoreScanner causes region server crash (apurtell: rev 9eaafe1ee91bc12d8933e17ad6cab7af8251c0f5)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


> CME in StoreScanner causes region server crash
> ----------------------------------------------
>
>                 Key: HBASE-20322
>                 URL: https://issues.apache.org/jira/browse/HBASE-20322
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.2
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>            Priority: Major
>             Fix For: 1.5.0, 1.3.3, 1.4.4
>
>         Attachments: HBASE-20322.branch-1.3.001.patch, HBASE-20322.branch-1.3.002.patch, HBASE-20322.branch-1.4.001.patch
>
>
> A region server crashed with a ConcurrentModificationException (CME) on our 1.3 cluster; the stack trace is below.
> [~toffer] and I checked and found a race condition between flush and scanner close. While
> StoreScanner.updateReaders() is updating the scanners after a newly flushed file (in the trace
> below, the flush is triggered by a region close during a split), the client's scanner could be
> closing at the same time, which causes the CME.
> It's rare, but since it crashes the region server, it needs to be fixed.
> FATAL regionserver.HRegionServer [regionserver/<rs>] : ABORTING region server <rs>: Replay of WAL required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: <regionname>
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2579)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2255)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2217)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2207)
> at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1501)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1420)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:398)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:566)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.clearAndClose(StoreScanner.java:797)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:825)
> at org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1155)
> PS: Ignore the line numbers in the above stack trace; the method calls should be enough to
> understand what's happening.
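
The race described above can be reproduced outside HBase with a plain ArrayList. The following is a minimal sketch and not the committed patch: the class ScannerCloseRaceSketch, the closeLock field, and the string "scanners" are all hypothetical stand-ins. One thread iterates the list the way clearAndClose() does in the trace, while a second thread clears it the way a client-side scanner close might. Latches force the interleaving so the result is deterministic. Guarding both paths with the same lock is one possible way to avoid the CME, assumed here for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch; none of these names come from HBase.
public class ScannerCloseRaceSketch {

    // Returns true if the iterating ("flush") thread saw a CME.
    static boolean run(boolean guarded) {
        List<String> scanners = new ArrayList<>(Arrays.asList("s1", "s2", "s3"));
        ReentrantLock closeLock = new ReentrantLock(); // assumed shared lock
        CountDownLatch iterating = new CountDownLatch(1);
        CountDownLatch cleared = new CountDownLatch(1);
        AtomicBoolean sawCme = new AtomicBoolean(false);

        // Plays the role of updateReaders() -> clearAndClose(): iterates the list.
        Thread flusher = new Thread(() -> {
            if (guarded) closeLock.lock();
            try {
                for (String scanner : scanners) {
                    iterating.countDown();
                    // Unguarded case: pause mid-iteration until the other
                    // thread has modified the list, forcing the bad interleaving.
                    if (!guarded) awaitQuietly(cleared);
                }
            } catch (ConcurrentModificationException e) {
                sawCme.set(true);
            } finally {
                if (guarded) closeLock.unlock();
            }
        });

        // Plays the role of the client's scanner close: mutates the same list.
        Thread closer = new Thread(() -> {
            awaitQuietly(iterating);
            if (guarded) closeLock.lock(); // blocks until the iteration is done
            try {
                scanners.clear();
            } finally {
                if (guarded) closeLock.unlock();
                cleared.countDown();
            }
        });

        flusher.start();
        closer.start();
        joinQuietly(flusher);
        joinQuietly(closer);
        return sawCme.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded: CME=" + run(false));
        System.out.println("guarded: CME=" + run(true));
    }
}
```

Running main prints "unguarded: CME=true" followed by "guarded: CME=false". In the unguarded run the iterator fails in ArrayList$Itr.next() via checkForComodification(), the same two frames that appear at the top of the "Caused by" section of the trace.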



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
