hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1104) Doubly-assigned regions redux
Date Fri, 02 Jan 2009 01:19:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660255#action_12660255
] 

Andrew Purtell commented on HBASE-1104:
---------------------------------------

Got this on the JP cluster today:

>From master log:

2009-01-02 00:05:09,671 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close
region content,b1d86dfc01925c3b891285b12d3bf452,1230853204947
2009-01-02 00:05:12,765 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 to server 10.3.134.207:60020
2009-01-02 00:05:15,778 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.207:60020
2009-01-02 00:05:15,779 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: content,b1d86dfc01925c3b891285b12d3bf452,1230853204947
open on 10.3.134.207:60020
2009-01-02 00:05:27,699 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.221:60020
2009-01-02 00:05:28,688 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 to server 10.3.134.222:60020
2009-01-02 00:05:31,697 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.222:60020
2009-01-02 00:05:31,697 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: content,b1d86dfc01925c3b891285b12d3bf452,1230853204947
open on 10.3.134.222:60020
2009-01-02 00:05:31,698 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating
row content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 in region .META.,,1 with startcode
1230848433306 and server 10.3.134.222:60020

There are three HRS involved here.

This leads to file system trouble:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server
10.3.134.225:60020 for region content,b1d86dfc01925c3b891285b12d3bf452,1230853204947, row
'b1d86dfc01925c3b891285b12d3bf452', but failed after 10 attempts.
Exceptions:
java.io.IOException: java.io.IOException: HStoreScanner failed construction
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.(StoreFileScanner.java:70)
	at org.apache.hadoop.hbase.regionserver.HStoreScanner.(HStoreScanner.java:84)
	at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2119)
	at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.(HRegion.java:1878)
	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1162)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1673)
	at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:894)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://jdc2-atr-dc-1.atr.trendmicro.com:50000/data/hbase/content/485774435/info/mapfiles/1743031531971685847/data
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:394)
	at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:679)
	at org.apache.hadoop.hbase.io.SequenceFile$Reader.(SequenceFile.java:1431)
	at org.apache.hadoop.hbase.io.SequenceFile$Reader.(SequenceFile.java:1426)
	at org.apache.hadoop.hbase.io.MapFile$Reader.createDataFileReader(MapFile.java:310)
	at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.createDataFileReader(HBaseMapFile.java:96)
	at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:292)
	at org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.(HBaseMapFile.java:79)
	at org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.(BloomFilterMapFile.java:65)
	at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.(StoreFileScanner.java:67)
	... 10 more


> Doubly-assigned regions redux
> -----------------------------
>
>                 Key: HBASE-1104
>                 URL: https://issues.apache.org/jira/browse/HBASE-1104
>             Project: Hadoop HBase
>          Issue Type: Bug
>         Environment: pset cluster with TRUNK.
>            Reporter: stack
>             Fix For: 0.19.0
>
>
> Testing, I see doubly assigned regions.  Below is from master log for TestTable,0000135598,1230761605500.
> {code}
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_SPLIT: TestTable,0000116170,1230761152219: TestTable,0000116170,1230761152219
split; daughters: TestTable,0000116170,1230761605500, TestTable,0000135598,1230761605500 from
XX.XX.XX.142:60020
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region TestTable,0000135598,1230761605500 to server XX.XX.XX.142:60020
> 2008-12-31 22:13:38,561 [IPC Server handler 6 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN: TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
TestTable,0000135598,1230761605500 open on XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
updating row TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 1230759988953
and server XX.XX.XX.142:60020
> 2008-12-31 22:13:44,640 [IPC Server handler 4 on 60000] DEBUG org.apache.hadoop.hbase.master.RegionManager:
Going to close region TestTable,0000135598,1230761605500
> 2008-12-31 22:13:50,441 [IPC Server handler 9 on 60000] INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region TestTable,0000135598,1230761605500 to server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,457 [IPC Server handler 5 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [IPC Server handler 5 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN: TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
TestTable,0000135598,1230761605500 open on XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
updating row TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 1230759988788
and server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,688 [IPC Server handler 6 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_CLOSE: TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:53,688 [HMaster] DEBUG org.apache.hadoop.hbase.master.HMaster: Processing
todo: ProcessRegionClose of TestTable,0000135598,1230761605500, false
> 2008-12-31 22:13:54,263 [IPC Server handler 7 on 60000] INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region TestTable,0000135598,1230761605500 to server XX.XX.XX.141:60020
> 2008-12-31 22:13:57,273 [IPC Server handler 9 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [IPC Server handler 0 on 60000] INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN: TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
TestTable,0000135598,1230761605500 open on XX.XX.XX.141:60020
> 2008-12-31 22:14:03,918 [HMaster] INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
updating row TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 1230759989031
and server XX.XX.XX.141:60020
> 2008-12-31 22:14:29,350 [RegionManager.metaScanner] DEBUG org.apache.hadoop.hbase.master.BaseScanner:
TestTable,0000135598,1230761605500 no longer has references to TestTable,0000116170,1230761152219
> {code}
> See how we choose to assign before we get the close back from the regionserver.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message