hbase-user mailing list archives

From Dejan Menges <dejan.men...@gmail.com>
Subject How server gets into failed servers list?
Date Mon, 13 Apr 2015 08:11:53 GMT

We recently had some issues with HDFS: a hardware problem with one of the
nodes, nodes died, and HDFS recovered, but we then found that something was
wrong with HBase. Checking the HMaster log, we saw that a bunch of our region
servers had ended up in the famous failed servers list, and this kept going on
and on until we restarted every one of them.

Are we doing something wrong? Is it possible to tune this somehow, so that
once a server is in this list, HMaster forgets about it or something?

Main question: how does HMaster decide that a server should be in the
failed servers list, and what exactly does this mean?

I was looking through the HBase book and googling, but besides some generic
answers I wasn't able to find anything more about the internals.

Thanks in advance!
