accumulo-user mailing list archives

From buttercream <buttercreamanonym...@gmail.com>
Subject Re: Tserver kills themselves from lost Zookeeper locks
Date Tue, 12 Nov 2013 17:36:55 GMT
I increased the memory on all of the servers to 32GB and confirmed that the
flags you mentioned are set in the env file. Unfortunately, within a day I
lost one of the tservers. In the tserver log, the entries leading up to the
event are:
02:00:03,835 [cache.LruBlockCache]
02:00:51,580 [tabletserver.TabletServer] DEBUG: MultiScanSess
02:01:02,267 [tabletserver.TabletServer] FATAL: Lost tablet server lock (reason = LOCK_DELETED), exiting.

What's interesting about this one is that the master log has no error
message at that time. What I do see is this:
02:01:02,168 [master.Master] DEBUG: Finished gathering information from 2 servers in 0.01 seconds

That would mean the tserver killed itself about 100 milliseconds
(02:01:02,168 vs. 02:01:02,267) after the master successfully gathered its
information. Any thoughts on this one?
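
To put numbers on the gaps between the log entries quoted above, here is a
minimal sketch in Python. The helper names (parse_ts, gap_ms) are made up
for illustration, and it assumes all timestamps fall on the same day and
that the clocks on the master and tserver hosts are in sync.

```python
from datetime import datetime, timedelta

def parse_ts(ts):
    # Timestamps in the quoted logs look like HH:MM:SS,mmm
    return datetime.strptime(ts, "%H:%M:%S,%f")

def gap_ms(earlier, later):
    # Whole milliseconds between two same-day timestamps (assumption: no
    # midnight rollover between them).
    return (parse_ts(later) - parse_ts(earlier)) // timedelta(milliseconds=1)

# Master's last successful gather vs. the tserver's FATAL line
print(gap_ms("02:01:02,168", "02:01:02,267"))  # 99

# Last DEBUG line in the tserver log vs. its FATAL line
print(gap_ms("02:00:51,580", "02:01:02,267"))  # 10687
```

The second figure shows that the tserver log is silent for roughly 10.7
seconds before the FATAL line. That alone does not say what happened in
that window, but it is the interval worth lining up against other logs.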
