accumulo-user mailing list archives

From "Dickson, Matt MR" <matt.dick...@defence.gov.au>
Subject ASSIGNED_TO_DEAD_SERVER #walogs:2 [SEC=UNOFFICIAL]
Date Tue, 17 Dec 2019 01:29:24 GMT
UNOFFICIAL


Hi,

I'm trying to recover from an issue caused by table.split.threshold being set to a very low
size. While Accumulo was splitting tablets, this generated a massive load on ZooKeeper, and
cluster nodes timed out communicating with ZooKeeper. We noticed the problem when tablet
servers were being declared dead.

I've corrected the threshold and Accumulo is back online; however, there are 8K unhosted
tablets that are not coming online.

Running the checkTablets script produces exactly as many errors as there are unhosted
tablets, each with a message like:

4d4;blah::words::4gfv43@(host:9997[23423442344234f23fd],null,null_ is ASSIGNED_TO_DEAD_SERVER #walogs:2
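
In case it helps with triage, here is a minimal Python sketch for tallying these errors per
dead server. The line shape it expects is an assumption based only on the single sample
above (one message per line), and the names LINE_RE and summarize are made up for this
example; they are not part of Accumulo.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the one sample message only:
#   <extent>@(<host:port>[<session>],...) is <STATE> #walogs:<n>
LINE_RE = re.compile(
    r"@\((?P<server>[^\[\s]+)\[[^\]]*\].*?\bis\s+(?P<state>[A-Z_]+)\s+#walogs:(?P<walogs>\d+)"
)


def summarize(lines):
    """Count ASSIGNED_TO_DEAD_SERVER tablets per server and sum their walog counts."""
    per_server = Counter()
    total_walogs = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("state") == "ASSIGNED_TO_DEAD_SERVER":
            per_server[match.group("server")] += 1
            total_walogs += int(match.group("walogs"))
    return per_server, total_walogs
```

Feeding it the saved script output line by line (for example, summarize(open(path)) with
path pointing at wherever the output was saved) would show whether the 8K tablets are
spread across many dead servers or concentrated on a few.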

I'm not concerned if data in these tablets is lost in returning the system to a healthy
state. I suspect other Accumulo operations can't proceed while tablets are unhosted, so I
just need to clear these errors.

Any advice would be great.

Thanks in advance,
Matt
