I recently upgraded to 0.8.1 and noticed what seems to be missing data after a commitlog replay on a single-node cluster. To reproduce: I start the node, insert about 600MB of data, stop it, and restart it. The log shows the commitlog replay messages and no errors, but some of the data is missing afterwards. If I flush before stopping the node, everything is fine, and cfstats shows different amounts of data in the SSTables in the two cases. Moreover, the amount of missing data is nondeterministic. Has anyone run into this? Thanks.
Here is the output of a side-by-side diff between cfstats outputs for a single CF before restarting (left) and after (right). Somehow a 37MB memtable became a 2.9MB SSTable (note the difference in write count as well)?
Column Family: Blocks             Column Family: Blocks
SSTable count: 0                | SSTable count: 1
Space used (live): 0            | Space used (live): 2907637
Space used (total): 0           | Space used (total): 2907637
Memtable Columns Count: 8198    | Memtable Columns Count: 0
Memtable Data Size: 37550510    | Memtable Data Size: 0
Memtable Switch Count: 0        | Memtable Switch Count: 1
Read Count: 0                     Read Count: 0
Read Latency: NaN ms.             Read Latency: NaN ms.
Write Count: 8198               | Write Count: 1526
Write Latency: 0.018 ms.        | Write Latency: 0.011 ms.
Pending Tasks: 0                  Pending Tasks: 0
Key cache capacity: 200000        Key cache capacity: 200000
Key cache size: 0                 Key cache size: 0
Key cache hit rate: NaN           Key cache hit rate: NaN
Row cache: disabled               Row cache: disabled
Compacted row minimum size: 0   | Compacted row minimum size: 1110
Compacted row maximum size: 0   | Compacted row maximum size: 2299
Compacted row mean size: 0      | Compacted row mean size: 1960
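For anyone who wants to compare two cfstats dumps without eyeballing a diff, here is a rough Python sketch. The function names `parse_cfstats` and `diff_cfstats` are made up for illustration, and the parsing assumes the plain `Name: value` lines shown above. It also assumes the Write Count after restart reflects the mutations applied during replay, which I have not verified.

```python
def parse_cfstats(text):
    """Parse 'Name: value' lines from one column family's cfstats output."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip lines that are not 'Name: value'
            stats[key] = value.strip()
    return stats


def diff_cfstats(before, after):
    """Return {field: (before_value, after_value)} for fields that differ."""
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    }


# A few of the fields from the output above (before and after restart).
before = parse_cfstats("""\
Column Family: Blocks
Memtable Columns Count: 8198
Memtable Data Size: 37550510
Write Count: 8198
Read Count: 0
""")
after = parse_cfstats("""\
Column Family: Blocks
Memtable Columns Count: 0
Memtable Data Size: 0
Write Count: 1526
Read Count: 0
""")

changed = diff_cfstats(before, after)
for field, (old, new) in changed.items():
    print(f"{field}: {old} -> {new}")

# Fraction of the original writes counted after the restart
# (assumption: this count corresponds to replayed mutations).
replayed = int(after["Write Count"]) / int(before["Write Count"])
print(f"writes after restart: {replayed:.1%} of writes before restart")
```

With the numbers above, 1526 of 8198 writes works out to roughly 18.6%, so by that measure most of the writes did not make it through the replay.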
Note that I applied the patch from https://issues.apache.org/jira/browse/CASSANDRA-2317 to my build, but there are no deletions involved, so I don’t think it’s relevant unless I messed something up while patching.