We are running Cassandra 1.0.12. From time to time, we see log messages like "java.io.IOError: java.io.IOException: dataSize of 71530420 starting at 587 would be larger than file {cf name} ..." in system.log.

If the column family named in the message is not a secondary index, running "scrub" seems to stop the message from being logged to system.log again. However, when the column family is a secondary index, there seems to be no way to make the error go away other than manually removing its data files from Cassandra.

I tried to patch Cassandra to make it possible to run scrub on secondary indexes. However, I see the statement "assert !cfs.isIndex();" in doScrub(). This makes me think that scrub is not intended to be run on secondary indexes. My question is: why is there such a limitation on secondary indexes? A secondary index should be implemented like a normal column family, so running scrub on it should be legitimate.

Any suggestions for getting rid of the error message?

