hbase-user mailing list archives

From Daniel Iancu <daniel.ia...@1and1.ro>
Subject live regionservers reported dead
Date Mon, 23 May 2011 16:27:07 GMT
Hello everybody,
I've run into a strange problem. We run a cluster of 6 regionservers, and
suddenly the client application started reporting "region not online"
errors. In the web console all regionservers appeared up. I ran hbck and
got strange results:

Number of Tables: 2
Number of live region servers: 6
Number of dead region servers: 12

hbck reported the cluster as being in an inconsistent state. With hbase
shell status 'detailed' I got the list of dead machines:

12 dead servers
     search-hadoop-eu006.v300.gmx.net,60020,1305025929461
     search-hadoop-eu002.v300.gmx.net,60020,1305019508570
     search-hadoop-eu004.v300.gmx.net,60020,1305019551236
     search-hadoop-eu003.v300.gmx.net,60020,1305025688666
     search-hadoop-eu005.v300.gmx.net,60020,1305025841017
     search-hadoop-eu006.v300.gmx.net,60020,1306156842070
     search-hadoop-eu005.v300.gmx.net,60020,1305019568146
     search-hadoop-eu001.v300.gmx.net,60020,1305025543786
     search-hadoop-eu004.v300.gmx.net,60020,1305025761173
     search-hadoop-eu002.v300.gmx.net,60020,1305025611163
     search-hadoop-eu006.v300.gmx.net,60020,1305019572576
     search-hadoop-eu003.v300.gmx.net,60020,1305019547053
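
To make the list easier to read, here is a minimal Python sketch that groups the entries by host. It assumes each entry has the form host,port,startcode and that the startcode is the regionserver's start time in epoch milliseconds. The helper names (parse_server_name, group_by_host) are made up for illustration and are not part of any HBase tool.

```python
from collections import defaultdict
from datetime import datetime, timezone

# The 12 entries exactly as printed by status 'detailed' above.
DEAD_SERVERS = """\
search-hadoop-eu006.v300.gmx.net,60020,1305025929461
search-hadoop-eu002.v300.gmx.net,60020,1305019508570
search-hadoop-eu004.v300.gmx.net,60020,1305019551236
search-hadoop-eu003.v300.gmx.net,60020,1305025688666
search-hadoop-eu005.v300.gmx.net,60020,1305025841017
search-hadoop-eu006.v300.gmx.net,60020,1306156842070
search-hadoop-eu005.v300.gmx.net,60020,1305019568146
search-hadoop-eu001.v300.gmx.net,60020,1305025543786
search-hadoop-eu004.v300.gmx.net,60020,1305025761173
search-hadoop-eu002.v300.gmx.net,60020,1305025611163
search-hadoop-eu006.v300.gmx.net,60020,1305019572576
search-hadoop-eu003.v300.gmx.net,60020,1305019547053
"""


def parse_server_name(line):
    """Split one 'host,port,startcode' entry into (host, port, startcode)."""
    host, port, startcode = line.strip().split(",")
    return host, int(port), int(startcode)


def group_by_host(text):
    """Return {host: sorted list of startcodes} for all non-empty lines."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            host, _port, startcode = parse_server_name(line)
            groups[host].append(startcode)
    return {host: sorted(codes) for host, codes in groups.items()}


if __name__ == "__main__":
    for host, codes in sorted(group_by_host(DEAD_SERVERS).items()):
        # Assumption: startcode is milliseconds since the Unix epoch.
        stamps = [
            datetime.fromtimestamp(c / 1000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")
            for c in codes
        ]
        print(host, len(codes), stamps)
```

Comparing the startcodes printed here with the ones shown for the live servers in the same status output would show whether a "dead" entry is an earlier start of the same host or the currently running process.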


It appears that all live regionservers are also listed as dead. I tried
hbck -fix and the cluster is now in an OK state, but it still reports the
12 machines above as dead.
I've checked the logs but found nothing obvious. Any ideas? We use CDH3u0.


Thanks
Daniel



