accumulo-user mailing list archives

From Eric Newton
Subject Re: Mutation Rejected exception with server Error 1
Date Sat, 26 Dec 2015 12:54:16 GMT
Generally speaking, mutations rejected due to resource contention are
considered a system failure, requiring a re-examination of system resources.

That requires re-architecting your ingest or adding significant resources.

You could do some substantial pre-processing of your ingest and bulk-load
the result. It will increase latency of the incoming information, but it
will reduce the pressure on accumulo.

Or, as I suggested, you could increase your processing/storage by an order
of magnitude. That is why the software is built to handle hundreds (or
more) of nodes.

3-5G of swap out of 32G is not a lot. But why is it using any at all?
Pulling 3G back from disk is not going to be very fast. If you must, reduce
the memory given to your tserver. Focus on keeping your system at zero swap.
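
One way to watch for this is to read the kernel's own counters. The following is a minimal sketch, assuming Linux hosts that expose /proc/meminfo; the function names are made up for illustration and are not part of Accumulo.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (most are in kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo):
    """Swap in use is SwapTotal minus SwapFree, both reported in kB."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


def report(path="/proc/meminfo"):
    """Print current swap usage, or do nothing if the file is absent."""
    if not os.path.exists(path):
        return
    with open(path) as f:
        used = swap_used_kb(parse_meminfo(f.read()))
    print(f"swap in use: {used / 1024 / 1024:.2f} GiB")
    if used > 0:
        print("WARNING: host is swapping; consider a smaller tserver heap")


if __name__ == "__main__":
    report()
```

Run it on each tserver host while ingest is under load; any non-zero figure is worth chasing down.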

I suggest, again, that you consider expanding your system to many more
nodes.  Accumulo is not written in hand-tuned assembler. It was written
with the knowledge that more hardware is pretty cheap, and that adding
nodes is better than chasing small inefficiencies.

On Thu, Dec 24, 2015 at 5:49 AM, mohit.kaushik wrote:

> @Eric: Yes, I have noticed 3GB to 5GB of swap usage out of 32GB on the
> servers. And if I explicitly resend the rejected mutations, this may create
> a loop in which mutations get rejected again and again. How can I
> handle that? How did you? Am I getting it right?
> @Josh: On one of the ZooKeeper hosts, the same drive was storing both
> ZooKeeper data and the Hadoop DataNode. I have changed it to use the same
> drive as the other hosts do. I hope this will resolve the ZooKeeper issue.
> Let's see. BTW, here is my zoo.cfg:
> clientPort=2181
> dataDir=/usr/local/zookeeper/data/
> syncLimit=5
> tickTime=2000
> initLimit=10
> maxClientCnxn=100
> server.1=orkash1:2888:3888
> server.2=orkash2:2888:3888
> server.3=orkash3:2888:3888
> Thanks a lot
> Mohit Kaushik
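
The zoo.cfg quoted above implies a few concrete timing windows, which can be worked out directly. This is a rough sketch, not an official tool: the helper names are hypothetical, KNOWN_KEYS is a deliberately partial list of documented ZooKeeper settings, and the 2-tick and 20-tick session bounds are ZooKeeper's documented defaults when minSessionTimeout and maxSessionTimeout are unset. Note that the documented connection-limit setting is spelled maxClientCnxns, so the maxClientCnxn line as quoted may not be taking effect.

```python
ZOO_CFG = """\
clientPort=2181
dataDir=/usr/local/zookeeper/data/
syncLimit=5
tickTime=2000
initLimit=10
maxClientCnxn=100
server.1=orkash1:2888:3888
server.2=orkash2:2888:3888
server.3=orkash3:2888:3888
"""

# Partial list of documented settings, for illustration only.
KNOWN_KEYS = {
    "clientPort", "dataDir", "dataLogDir", "tickTime", "initLimit",
    "syncLimit", "maxClientCnxns", "minSessionTimeout", "maxSessionTimeout",
}


def parse_cfg(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def timing_ms(cfg):
    """Derive the millisecond windows implied by the tick-based settings."""
    tick = int(cfg["tickTime"])
    return {
        "init_window": int(cfg["initLimit"]) * tick,
        "sync_window": int(cfg["syncLimit"]) * tick,
        "min_session_timeout": int(cfg.get("minSessionTimeout", 2 * tick)),
        "max_session_timeout": int(cfg.get("maxSessionTimeout", 20 * tick)),
    }


def unrecognized_keys(cfg):
    """Keys that are neither server.N entries nor in the partial known list."""
    return [k for k in cfg if not k.startswith("server.") and k not in KNOWN_KEYS]


if __name__ == "__main__":
    cfg = parse_cfg(ZOO_CFG)
    print(timing_ms(cfg))
    print("check spelling of:", unrecognized_keys(cfg))
```

With tickTime=2000, followers get 20 s to sync initially and 10 s to stay in sync, and sessions are negotiated between 4 s and 40 s.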
> On 12/24/2015 12:47 AM, Josh Elser wrote:
>> Eric Newton wrote:
>>> Failure to talk to zookeeper is *really* unexpected.
>>> Have you noticed your nodes using any significant swap?
>>
>> Emphasis on this. Failing to connect to ZooKeeper for 60s (2*30) is a very
>> long time (although I think I have seen longer JVM GC pauses before).
>>
>> A couple of generic ZooKeeper questions:
>> 1. Can you share your zoo.cfg?
>> 2. Make sure that ZooKeeper has a "dedicated" drive for its dataDir. HDFS
>> DataNodes sharing the drive that ZooKeeper uses for its transaction log can
>> cause ZooKeeper to be starved for I/O throughput. A normal "spinning" disk
>> is also better for ZK than an SSD (last I read).
>> 3. Check OS/host level metrics on these ZooKeeper hosts during the times
>> you see these failures.
>> 4. Consider moving your ZooKeeper hosts to "less busy" nodes if you can.
>> You can consider adding more ZooKeeper hosts to the quorum, but keep in
>> mind that this will increase the minimum latency for ZooKeeper operations
>> (as more nodes, n/2 + 1, need to acknowledge updates).
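
The n/2 + 1 arithmetic in the last point can be made concrete with a small sketch. The function names are illustrative only, and the division is integer division.

```python
def quorum_size(n):
    """Smallest majority of an n-server ensemble: n/2 + 1, rounding down."""
    return n // 2 + 1


def tolerated_failures(n):
    """Servers that can be lost while a majority remains."""
    return n - quorum_size(n)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(f"{n} servers: {quorum_size(n)} must ack, "
              f"{tolerated_failures(n)} may fail")
```

Going from 3 to 5 servers raises the number of acknowledgements per update from 2 to 3, which is the latency cost being described, in exchange for surviving a second failure.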
