hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: RegionServer goes down in Compact SplitThread
Date Thu, 08 Aug 2013 10:27:45 GMT
Hi Prasad,

0.90.6 is a pretty old HBase version, and CDH3u5 is likewise a pretty old
CDH release.

Any chance you could move to a more recent version?

JM

2013/8/8 Prasad GS <gsp200183@gmail.com>

> Hi,
>
> We are using the Cloudera CDH3u5 distribution of HBase (0.90.6). The RS goes
> down suddenly, and in the region server logs we see the following
> exception:
>
> 2013-08-07 20:36:58,008 INFO org.apache.hadoop.hbase.regionserver.Store:
> Completed compaction of 18 file(s), new file=hdfs://192.168.0.29:9000/hbase/UsageHistoryMA/1f50c6795c7753315f1fbc04946753d1/d/3311452476716076182,
> size=320.2m; total size for store is 320.2m
> 2013-08-07 20:36:58,008 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> completed compaction on region
> UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1. after 1mins,
> 51sec
> 2013-08-07 20:36:58,009 INFO
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of
> region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Closing UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.: disabling
> compactions & flushes
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Updates disabled for region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,010 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> closed d
> 2013-08-07 20:36:58,010 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.
> 2013-08-07 20:36:58,029 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Instantiated UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375900618008.13150e07893adb4eded6d4dc98374e9e.
> 2013-08-07 20:36:58,031 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Instantiated UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00
> \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8.
> 2013-08-07 20:36:58,038 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
> Offlined parent region UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1. in META
> 2013-08-07 20:36:58,085 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> loaded hdfs://192.168.0.29:9000/hbase/UsageHistoryMA/6e9d9b93a9509909ed5c4d9e2bd321a8/d/3311452476716076182.1f50c6795c7753315f1fbc04946753d1,
> isReference=true, isBulkLoadResult=false, seqid=26966370,
> majorCompaction=false
> 2013-08-07 20:36:58,087 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Onlined UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00
> \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8.; next
> sequenceid=26966371
> 2013-08-07 20:36:58,087 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
> requested for UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00
> \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8. because
> Region has references on open; priority=99, compaction queue size=18
> 2013-08-07 20:36:58,092 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
> Added daughter UsageHistoryMA,'v\x13\x07\x01\x00\x00\x00\x00
> \x12v`\x12\x15,1375900618008.6e9d9b93a9509909ed5c4d9e2bd321a8. in region
> .META.,,1, serverInfo=dl360x2807,60020,1374636004119
> 2013-08-07 20:36:58,093 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running
> rollback/cleanup of failed split of
> UsageHistoryMA,'u\x13\x07\x01\x00\x00\x00\x00
> \x12u'X\x83,1375898307352.1f50c6795c7753315f1fbc04946753d1.; Failed dl360x2807,60020,1374636004119-daughterOpener=13150e07893adb4eded6d4dc98374e9e
> java.io.IOException: Failed dl360x2807,60020,1374636004119-daughterOpener=13150e07893adb4eded6d4dc98374e9e
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:307)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:205)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:135)
> Caused by: java.util.ConcurrentModificationException
>         at java.util.SubList.checkForComodification(AbstractList.java:752)
>         at java.util.SubList.size(AbstractList.java:625)
>         at java.util.AbstractList.add(AbstractList.java:91)
>         at org.apache.hadoop.hbase.monitoring.TaskMonitor.createStatus(TaskMonitor.java:75)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:346)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2860)
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:383)
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:352)
> 2013-08-07 20:36:58,112 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
> serverName=dl360x2807,60020,1374636004119, load=(requests=91, regions=170,
> usedHeap=7213, maxHeap=32730): Abort; we got an error after
> point-of-no-return
> 2013-08-07 20:36:58,113 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> requests=30, regions=170, stores=171, storefiles=167,
> storefileIndexSize=134, memstoreSize=187, mbInMemoryWithoutWAL=0,
> numberOfPutsWithoutWAL=0, compactionQueueSize=17, flushQueueSize=0,
> usedHeap=6992, maxHeap=32730, blockCacheSize=3028798008,
> blockCacheFree=7267346888, blockCacheCount=51548,
> blockCacheHitCount=55248138, blockCacheMissCount=3593839,
> blockCacheEvictedCount=0, blockCacheHitRatio=93,
> blockCacheHitCachingRatio=99
> 2013-08-07 20:36:58,119 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Abort; we got
> an error after point-of-no-return
> 2013-08-07 20:36:58,119 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
> regionserver60020.compactor exiting
> 2013-08-07 20:36:59,161 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60020
>
> Could someone please let me know why the region split failed and why the
> RS went down? To me, the ConcurrentModificationException looks really
> trivial.
>
>
> Regards,
> Prasad
>
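
Note on the trace: the "Caused by" frames show the exception being raised by java.util.SubList.checkForComodification, reached through AbstractList.add. As a minimal sketch of that JDK behaviour in general (the class and method names below are hypothetical, and this is not a reconstruction of what TaskMonitor.createStatus does), a subList view fails this way once its backing list has been structurally modified behind its back. The sketch is single-threaded for determinism; whether the failure in the log came from two threads sharing a list is an assumption the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; not part of HBase.
class SubListCme {

    /** Returns true if adding through a stale subList view throws CME. */
    static boolean staleViewThrows() {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));

        // A subList is a live view over the backing list, not a copy.
        List<String> view = backing.subList(0, 2);

        // Structural modification made directly on the backing list,
        // bypassing the view. The view is now stale.
        backing.add("e");

        try {
            // The view's comodification check detects the change
            // when the view is used again.
            view.add("x");
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale view throws: " + staleViewThrows());
    }
}
```

In other words, the exception itself is a fail-fast check rather than data corruption. The abort in the log is reported separately, as "we got an error after point-of-no-return".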
