cassandra-user mailing list archives

From Rudolf van der Leeden <>
Subject StackOverflowError with repair after bulkloading SSTables
Date Fri, 20 Jul 2012 11:25:36 GMT

I'm currently testing the restore of a Cassandra 1.1.2 snapshot.

The steps to reproduce the problem:

 - snapshot a 3-node production cluster (1.1.2) with RF=3 and LCS (leveled compaction) ==> 8GB data/node
 - create a new 3-node cluster (node1,2,3)
 - stop node1 / copy data (SSTables) from the snapshot (just one node) / start node1
 - Cassandra opens 1185 SSTable files (*-hd-XXXX); pending compaction tasks: 247
 - before Cassandra starts compacting, run: nodetool repair -pr
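
To double-check how many SSTables a restored node is about to open, the files in the column family's data directory can be counted by generation number. This is only a rough sketch: it assumes file names of the form <keyspace>-<columnfamily>-hd-<generation>-<Component>.db, which matches the *-hd-XXXX pattern above, and the helper names and the directory argument are made up for illustration.

```python
import re
from pathlib import Path

# Assumed naming: <keyspace>-<columnfamily>-hd-<generation>-<Component>.db
SSTABLE_RE = re.compile(r"-hd-(\d+)-[A-Za-z]+\.db$")


def count_sstables(filenames):
    """Count distinct SSTable generations among the given file names."""
    generations = set()
    for name in filenames:
        match = SSTABLE_RE.search(name)
        if match:
            # Several component files (Data, Index, ...) share one generation.
            generations.add(int(match.group(1)))
    return len(generations)


def count_sstables_in_dir(data_dir):
    """Count SSTables in a data directory (the path is supplied by the caller)."""
    return count_sstables(p.name for p in Path(data_dir).iterdir() if p.is_file())
```

Each SSTable consists of several component files, so the sketch counts distinct generation numbers rather than raw files.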

The error messages in system.log:

 INFO [AntiEntropySessions:1] 2012-07-20 10:53:16,743 (line 666) [repair
#1c59b930-d259-11e1-0000-a0b0843ee1fe] new session: will sync /, /,
/ on range (113427455640312821154458202477256070485,0] for highscores.[highscore]
 INFO [AntiEntropySessions:1] 2012-07-20 10:53:16,747 (line 871) [repair
#1c59b930-d259-11e1-0000-a0b0843ee1fe] requesting merkle trees for highscore (to [/,
/, /])
 INFO [AntiEntropyStage:1] 2012-07-20 10:53:17,085 (line 206) [repair
#1c59b930-d259-11e1-0000-a0b0843ee1fe] Received merkle tree for highscore from /
 INFO [AntiEntropyStage:1] 2012-07-20 10:53:17,104 (line 206) [repair
#1c59b930-d259-11e1-0000-a0b0843ee1fe] Received merkle tree for highscore from /
ERROR [ValidationExecutor:1] 2012-07-20 10:53:17,865 (line 134)
Exception in thread Thread[ValidationExecutor:1,1,main]
        at$1.iterator(    ....  (repeating 1024

The repair command never returns.
It increases the Active/Pending counters of "AntiEntropySessions" in tpstats, and the counters never go back to 0.
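
To watch those counters without reading the tpstats output by hand, something like the following could be used. It is a minimal sketch that assumes each tpstats row starts with the pool name followed by the Active and Pending columns, and that nodetool is on the PATH; the function names are hypothetical.

```python
import subprocess


def pool_counters(tpstats_text, pool="AntiEntropySessions"):
    """Return (active, pending) for the given pool, or None if it is not listed.

    Assumes each row is: <pool name> <active> <pending> ...
    """
    for line in tpstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == pool:
            return int(fields[1]), int(fields[2])
    return None


def read_tpstats():
    """Run `nodetool tpstats` against the local node and return its output."""
    result = subprocess.run(
        ["nodetool", "tpstats"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling pool_counters(read_tpstats()) every few seconds would show whether Active/Pending ever drop back to 0 after the repair is started.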

After some time, compaction starts as usual without problems.

Am I doing something wrong? The error seems to be tied to LCS; there is no problem with STCS.
There is plenty of space in the Java heap (7G) and on the disk (1.7TB).
RAM is 15G and swap is 20G. This is an Amazon m1.xlarge instance with Ubuntu/Lucid Linux.

Thanks for any hints or help,
Rudolf VanderLeeden
