Thank you for the reply, Aaron. Unfortunately, I could not find any additional information in the logs. However, upgrading from 2.0.2 to 2.0.3 seems to have done the trick!

Best regards,
-David Laube


On Dec 11, 2013, at 6:51 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

[2013-12-08 11:04:02,047] Repair session ff16c510-5ff7-11e3-97c0-5973cc397f8f for range (1246984843639507027,1266616572749926276] failed with error org.apache.cassandra.exceptions.RepairException: [repair #ff16c510-5ff7-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (1246984843639507027,1266616572749926276]] Validation failed in /10.x.x.48
The 10.x.x.48 node sent a tree response to this node that did not contain the Merkle tree. This node then killed the repair session.

Look for log messages on 10.x.x.48 that correlate with the repair session ID above. They may look like:

logger.error("Failed creating a merkle tree for " + desc + ", " + initiator + " (see log for details)");

or 

logger.info(String.format("[repair #%s] Sending completed merkle tree to %s for %s/%s", desc.sessionId, initiator, desc.keyspace, desc.columnFamily));
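
A minimal sketch of that correlation step, in Python. It is not part of Cassandra or nodetool, and the helper names are made up for illustration. It takes the session ID from a failed repair line and returns every log line that mentions it:

```python
import re

# Hypothetical helpers for illustration only; not a Cassandra or nodetool API.

def session_id_from_error(error_line):
    """Extract the session ID from a 'Repair session <id> for range ...' line."""
    match = re.search(r"Repair session ([0-9a-f-]{36})", error_line)
    return match.group(1) if match else None


def lines_for_session(log_lines, session_id):
    """Return the log lines that mention the given repair session ID."""
    needle = session_id.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]
```

To use it, open the system log on 10.x.x.48 (for example /var/log/cassandra/system.log), pass the file object as log_lines, and print whatever comes back for session ff16c510-5ff7-11e3-97c0-5973cc397f8f.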

Hope that helps. 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting

On 10/12/2013, at 12:57 pm, Laing, Michael <michael.laing@nytimes.com> wrote:

My experience is that you must upgrade to 2.0.3 ASAP to fix this.

Michael


On Mon, Dec 9, 2013 at 6:39 PM, David Laube <dave@stormpath.com> wrote:
Hi All,

We are running Cassandra 2.0.2 and have recently stumbled upon an issue with nodetool repair. Upon running nodetool repair on each of the 5 nodes in the ring (one at a time), we observe the following exceptions returned to standard out:


[2013-12-08 11:04:02,047] Repair session ff16c510-5ff7-11e3-97c0-5973cc397f8f for range (1246984843639507027,1266616572749926276] failed with error org.apache.cassandra.exceptions.RepairException: [repair #ff16c510-5ff7-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (1246984843639507027,1266616572749926276]] Validation failed in /10.x.x.48
[2013-12-08 11:04:02,063] Repair session 284c8b40-5ff8-11e3-97c0-5973cc397f8f for range (-109256956528331396,-89316884701275697] failed with error org.apache.cassandra.exceptions.RepairException: [repair #284c8b40-5ff8-11e3-97c0-5973cc397f8f on keyspace_name/col_family2, (-109256956528331396,-89316884701275697]] Validation failed in /10.x.x.103
[2013-12-08 11:04:02,070] Repair session 399e7160-5ff8-11e3-97c0-5973cc397f8f for range (8901153810410866970,8915879751739915956] failed with error org.apache.cassandra.exceptions.RepairException: [repair #399e7160-5ff8-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (8901153810410866970,8915879751739915956]] Validation failed in /10.x.x.103
[2013-12-08 11:04:02,072] Repair session 3ea73340-5ff8-11e3-97c0-5973cc397f8f for range (1149084504576970235,1190026362216198862] failed with error org.apache.cassandra.exceptions.RepairException: [repair #3ea73340-5ff8-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (1149084504576970235,1190026362216198862]] Validation failed in /10.x.x.103
[2013-12-08 11:04:02,091] Repair session 6f0da460-5ff8-11e3-97c0-5973cc397f8f for range (-5407189524618266750,-5389231566389960750] failed with error org.apache.cassandra.exceptions.RepairException: [repair #6f0da460-5ff8-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (-5407189524618266750,-5389231566389960750]] Validation failed in /10.x.x.103
[2013-12-09 23:16:36,962] Repair session 7efc2740-6127-11e3-97c0-5973cc397f8f for range (1246984843639507027,1266616572749926276] failed with error org.apache.cassandra.exceptions.RepairException: [repair #7efc2740-6127-11e3-97c0-5973cc397f8f on keyspace_name/col_family1, (1246984843639507027,1266616572749926276]] Validation failed in /10.x.x.48
[2013-12-09 23:16:36,986] Repair session a8c44260-6127-11e3-97c0-5973cc397f8f for range (-109256956528331396,-89316884701275697] failed with error org.apache.cassandra.exceptions.RepairException: [repair #a8c44260-6127-11e3-97c0-5973cc397f8f on keyspace_name/col_family2, (-109256956528331396,-89316884701275697]] Validation failed in /10.x.x.210
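
To see at a glance which nodes the validation failures point at, the output above can be tallied per node and column family. This is only a rough Python sketch: the regular expression assumes the exact line format shown above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the failure lines printed by nodetool repair in the format shown above.
# This pattern is an assumption based on those lines, not an official format.
FAILURE = re.compile(
    r"Repair session (?P<session>[0-9a-f-]{36}) .*?"
    r"on (?P<keyspace>[^/\s]+)/(?P<table>[^,\s]+), .*"
    r"Validation failed in /(?P<node>\S+)"
)


def failures_by_node(output_lines):
    """Count validation failures per (node, column family) pair."""
    counts = Counter()
    for line in output_lines:
        match = FAILURE.search(line)
        if match:
            counts[(match.group("node"), match.group("table"))] += 1
    return counts
```

Feeding it the saved repair output and printing counts.most_common() lists the node and column family pairs that fail most often, which helps decide which node's system.log to read first.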

The /var/log/cassandra/system.log shows similar information to the above, with no real explanation of the root cause behind the exception(s). There also does not appear to be any additional information in /var/log/cassandra/cassandra.log. We have tried restoring a recent snapshot of the keyspace in question to a separate staging ring, and the repair runs successfully and without exception there. This is even after we tried insert/delete on the keyspace in the separate staging ring. Has anyone seen this behavior before, and what can we do to resolve it? Any assistance would be greatly appreciated.

Best regards,
-Dave