cassandra-user mailing list archives

From Jason <jkushm...@rocketfuelinc.com>
Subject RE: Written data is lost and no exception thrown back to the client
Date Fri, 21 Aug 2015 01:28:59 GMT
What consistency level were the writes?


-----Original Message-----
From: "Robert Wille" <rwille@fold3.com>
Sent: 8/20/2015 18:25
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Written data is lost and no exception thrown back to the client

I wrote a data migration application which I was testing, and I pushed it too hard and the
FlushWriter thread pool blocked, and I ended up with dropped mutation messages. I compared
the source data against what is in my cluster, and as expected I have missing records. The
strange thing is that my application didn’t error out. I’ve been doing some forensics,
and there’s a lot about this that makes no sense and makes me feel very uneasy.

I use a lot of asynchronous queries, and I thought it was possible that I had bad error handling,
so I checked for errors in other, independent ways.

I have a retry policy that on the first failure logs the error and then requests a retry.
On the second failure it logs the error and then rethrows. A few retryable errors appeared
in my logs, but no fatal errors. In theory, I should have a fatal error in my logs for any
error that gets reported back to the client.
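
For readers following the thread, the retry behavior described above can be sketched roughly as follows. This is only an illustration using plain Java: it does not use the DataStax driver's RetryPolicy interface, and the names RetryOnceSketch and callWithOneRetry are hypothetical, not taken from the original application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of "log and retry once, then log and rethrow".
// It is NOT the driver's RetryPolicy API; it only mirrors the described behavior.
class RetryOnceSketch {
    // Stand-in for the application log.
    static final List<String> log = new ArrayList<>();

    static <T> T callWithOneRetry(Supplier<T> operation) {
        try {
            return operation.get();
        } catch (RuntimeException first) {
            // First failure: record it and fall through to a single retry.
            log.add("retryable error: " + first.getMessage());
        }
        try {
            return operation.get();
        } catch (RuntimeException second) {
            // Second failure: record it as fatal and propagate to the caller.
            log.add("fatal error: " + second.getMessage());
            throw second;
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Fails on the first attempt only, so the retry succeeds.
        String result = callWithOneRetry(() -> {
            if (attempts[0]++ == 0) {
                throw new RuntimeException("write timeout");
            }
            return "ok";
        });
        System.out.println(result + " " + log);
    }
}
```

Under this scheme, any error that reaches the caller leaves a "fatal error" entry behind, which is the property the paragraph above relies on.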

I wrap my Session object, and all queries go through this wrapper. This wrapper logs all query
errors. Synchronous queries are wrapped in a try/catch which logs and rethrows. Asynchronous
queries use a FutureCallback to log any onFailure invocations.
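
As a rough illustration of the wrapper described above, the sketch below logs failures on both the synchronous and asynchronous paths. It is an assumption-laden stand-in: it uses java.util.concurrent.CompletableFuture instead of the driver's Session, ResultSetFuture, and Guava FutureCallback, and the names LoggingSessionSketch, execute, and executeAsync are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

// Hypothetical stand-in for a wrapped Session; not the driver's API.
class LoggingSessionSketch {
    // Stand-in for the application log; thread-safe because async
    // callbacks may run on other threads.
    static final List<String> errorLog = new CopyOnWriteArrayList<>();

    // Synchronous path: log the failure, then rethrow it to the caller.
    static <T> T execute(String query, Supplier<T> call) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            errorLog.add("sync failure: " + query + ": " + e.getMessage());
            throw e;
        }
    }

    // Asynchronous path: attach a callback that logs any failure.
    // The returned future still completes exceptionally for the caller.
    static <T> CompletableFuture<T> executeAsync(String query, CompletableFuture<T> future) {
        return future.whenComplete((result, error) -> {
            if (error != null) {
                errorLog.add("async failure: " + query + ": " + error.getMessage());
            }
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        executeAsync("INSERT ...", pending);
        // Simulate the query failing after it was submitted.
        pending.completeExceptionally(new RuntimeException("write timeout"));
        System.out.println(errorLog);
    }
}
```

A wrapper of this shape can only log failures that are actually delivered to the client; it records nothing if the future completes successfully.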

My logs indicate that no errors whatsoever were reported back to me. I do not understand how
I can get dropped mutation messages and not know about it. I am running Cassandra 2.0.16 with the
DataStax Java driver 2.0.8 on a three-node cluster with RF=1. If someone could help me understand how this
can occur, I would greatly appreciate it. A database that errors out is one thing. A database
that errors out and makes you think everything was fine is quite another.

Thanks

Robert

