hbase-user mailing list archives

From Martin Fiala <fial...@gmail.com>
Subject Regions assigned multiple times after disabling table
Date Thu, 08 Apr 2010 13:38:48 GMT
Hello,

Last week we ran into a strange error, and today it happened again.
After altering a table and enabling it again, we got
NoSuchColumnFamilyException for some regions when working with the table.

We discovered that the error is caused by some regions being assigned
to multiple servers and not really being offline while the table was
disabled. HMaster claimed that all regions were disabled, but the
RegionServers still held some regions online.

This is what HMaster logged right after we disabled table 'robot':
2010-04-08 14:12:12,127 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405 to
setClosing list
2010-04-08 14:12:13,958 INFO
org.apache.hadoop.hbase.master.ServerManager: Processing
MSG_REPORT_CLOSE:
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405 from
fernet7-v49.ng.seznam.cz,60020,1270631603011; 60 of 70
2010-04-08 14:12:28,048 DEBUG org.apache.hadoop.hbase.master.HMaster:
Processing todo: ProcessRegionClose of
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405, true,
reassign: false
2010-04-08 14:12:28,049 INFO
org.apache.hadoop.hbase.master.ProcessRegionClose$1: region closed:
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405
2010-04-08 14:12:28,054 DEBUG
org.apache.hadoop.hbase.master.BaseScanner: GET on
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405 got
different startcode than SCAN: sc=0, serverAddress=1270631603011
2010-04-08 14:12:28,054 DEBUG
org.apache.hadoop.hbase.master.BaseScanner: Current assignment of
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405 is not
valid; serverAddress=, startCode=0 unknown.
2010-04-08 14:12:28,240 INFO
org.apache.hadoop.hbase.master.RegionManager: Assigning region
robot,cz.sika.www.\x2180/en/cz-ind/cz-ind-news.htm,1270552929405 to
fernet1-v49.ng.seznam.cz,60020,1270540750568
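
The pattern in the log above (a region reported closed and then assigned
again) can be searched for mechanically to find other affected regions.
What follows is only a rough sketch, assuming the master log format shown
above. The function name is hypothetical, and it would also flag regions
that are legitimately reassigned once the table is enabled again, so the
input should be limited to the window in which the table was disabled.

```python
import re

# Patterns taken from the HMaster log lines quoted above.
CLOSED = re.compile(r"region closed:\s+(\S+)")
ASSIGNED = re.compile(r"Assigning region\s+(\S+)\s+to\s+(\S+)")


def regions_reassigned_after_close(log_text):
    """Return (region, server) pairs assigned after a 'region closed' entry.

    Hypothetical helper: it only inspects the text of the master log and
    does not talk to HBase.
    """
    # Log entries may be wrapped across lines (as in a mail body), so
    # collapse all whitespace before matching.
    text = " ".join(log_text.split())

    events = []
    for m in CLOSED.finditer(text):
        events.append((m.start(), "closed", m.group(1), None))
    for m in ASSIGNED.finditer(text):
        events.append((m.start(), "assigned", m.group(1), m.group(2)))
    # Order events by their position in the log.
    events.sort(key=lambda e: e[0])

    closed = set()
    suspects = []
    for _, kind, region, server in events:
        if kind == "closed":
            closed.add(region)
        elif region in closed:
            suspects.append((region, server))
    return suspects
```

Calling regions_reassigned_after_close(open(path).read()) on a master log
excerpt (path is a placeholder) returns the suspicious regions together
with the server each one was assigned to.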

Now the table is disabled, but the region is online on
"fernet1-v49.ng.seznam.cz"! Is there a race condition?

Regards
Martin Fiala
