accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Mutation Rejected exception with server Error 1
Date Wed, 23 Dec 2015 19:17:38 GMT
Eric Newton wrote:
>
> Failure to talk to zookeeper is *really* unexpected.
>
> Have you noticed your nodes using any significant swap?

Emphasis on this. Failing to connect to ZooKeeper for 60s (2*30) is a 
very long time (although I think I have seen longer JVM GC pauses before).

A couple of generic ZooKeeper questions:

1. Can you share your zoo.cfg?

2. Make sure that ZooKeeper has a "dedicated" drive for its dataDir. 
If HDFS DataNodes share the drive that holds ZooKeeper's transaction log, 
ZooKeeper can be starved for I/O throughput. A normal "spinning" disk is 
also better for ZK than an SSD (last I read).

3. Check OS/host level metrics on these ZooKeeper hosts during the times 
you see these failures.

4. Consider moving your ZooKeeper hosts to "less busy" nodes if you can. 
You can also consider adding more ZooKeeper hosts to the quorum, but keep 
in mind that this will increase the minimum latency for ZooKeeper 
operations, since a majority of the n hosts (n/2 + 1, rounding n/2 down) 
has to acknowledge each update.
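
To make the n/2 + 1 arithmetic concrete, here is a rough Python sketch. 
The function names are mine for illustration and are not part of any 
ZooKeeper API; it only assumes the usual majority rule, with n/2 rounded 
down.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    # Integer division rounds n/2 down, so this is floor(n/2) + 1.
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(f"hosts={n} acks_needed={quorum_size(n)} "
              f"can_lose={tolerated_failures(n)}")
```

Going from 3 to 5 hosts raises the number of acknowledgements per update 
from 2 to 3, which is where the extra latency comes from. An even-sized 
ensemble (4) needs more acknowledgements than 3 hosts do without 
tolerating any more failures.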
