Hi Aaron,

"startup is single threaded and the scrub runs before the tables are opened".

This is what I was thinking too. However, after using the debugger to trace the code, I realized that MeteredFlusher (see the "countFlushBytes" method) might open the sstables before the scrub is completed. I suppose this is the cause of the exceptions I saw.

My plan is to add a boolean flag named "scrubCompleted" at AbstractCassandraDaemon or StorageService. By default, it is false, after the scrub is completed the AbstractCassandraDaemon needs to set it to true. The MeterdFluster needs to make sure the scrub is completed by checking this boolean value and starts to do all the calculation.

Is this a good plan? or it might have side effects?

Thanks and Regards,
Boris


On Mon, Apr 15, 2013 at 4:26 AM, aaron morton <aaron@thelastpickle.com> wrote:
From the log messages, it looked like the table/keyspace was
opened before the scrubDataDirectories was executed. This created a race
condition between two threads.
Seems odd. 
AFAIK that startup is single threaded and the scrub runs before the tables are opened. See AbstractCassandraDaemon.setup()

INFO [OptionalTasks:1] 2013-04-09 02:49:39,900 SecondaryIndexManager.java
(line 184) Creating new index :
ColumnDefinition{name=6d6f62696c6974795a6f6e6555554944,
validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
index_name='fmzd_ap_mobilityZoneUUID'}
ERROR [FlushWriter:1] 2013-04-09 02:49:39,916 AbstractCassandraDaemon.java
(line 139) Fatal exception in thread Thread[FlushWriter:1,5,main]
java.io.IOError: java.io.IOException: rename failed of
/test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-hd-21-Data.db

Looks like a secondary index is being created at startup and there is an error renaming the file. 
OR
The node was shut down before the index was built and it's been rebuilt at startup.

Both of these are async operations and cause a race with scrubDirectories(). 

Probably not the log replaying because it looks like the sstables have not been opened. 

I *think* the way around this is to um…. 
* move all existing data and commit log out of the way 
* start with node with -Dcassandra.join_ring=false JVM option in cassandra-env.sh
* check that all indexes are built using nodetool cfstats
* shut it down
* put the commit log and data dirs back in place. 

All we want to do is get the system KS updated, but in 1.0 that's a serialised object and not easy to poke. 

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton

On 14/04/2013, at 3:50 PM, Boris Yen <yulinyen@gmail.com> wrote:

Hi All,

Recently, we encountered an error on 1.0.12 that prevented cassandra from
starting up. From the log messages, it looked like the table/keyspace was
opened before the scrubDataDirectories was executed. This created a race
condition between two threads. One was trying to rename files while the
other was trying to remove tmp files. I was wondering if anyone could
provide us some information or workaround for this.

INFO [MemoryMeter:1] 2013-04-09 02:49:39,868 Memtable.java (line 186)
CFS(Keyspace='fmzd', ColumnFamily='alarm.fmzd_alarm_category') liveRatio is
3.7553409423470883 (just-counted was 3.1413828689370487).  calculation took
2ms for 265 columns
INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,868 SSTableReader.java (line
153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshRole-hd-2 (83 bytes)
INFO [SSTableBatchOpen:2] 2013-04-09 02:49:39,868 SSTableReader.java (line
153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshRole-hd-1 (123 bytes)
INFO [Creating index: alarm.fmzd_alarm_category] 2013-04-09 02:49:39,874
ColumnFamilyStore.java (line 705) Enqueuing flush of
Memtable-alarm.fmzd_alarm_category@413535513(14025/65835 serialized/live
bytes, 275 ops)
INFO [OptionalTasks:1] 2013-04-09 02:49:39,877 SecondaryIndexManager.java
(line 184) Creating new index : ColumnDefinition{name=6d65736853534944,
validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
index_name='fmzd_ap_meshSSID'}
INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,895 SSTableReader.java (line
153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshSSID-hd-1 (122 bytes)
INFO [SSTableBatchOpen:2] 2013-04-09 02:49:39,896 SSTableReader.java (line
153) Opening /test/db/data/fmzd/ap.fmzd_ap_meshSSID-hd-2 (82 bytes)
INFO [OptionalTasks:1] 2013-04-09 02:49:39,900 SecondaryIndexManager.java
(line 184) Creating new index :
ColumnDefinition{name=6d6f62696c6974795a6f6e6555554944,
validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
index_name='fmzd_ap_mobilityZoneUUID'}
ERROR [FlushWriter:1] 2013-04-09 02:49:39,916 AbstractCassandraDaemon.java
(line 139) Fatal exception in thread Thread[FlushWriter:1,5,main]
java.io.IOError: java.io.IOException: rename failed of
/test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-hd-21-Data.db
at
org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:375)
at
org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:319)
at
org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:276)
at org.apache.cassandra.db.Memtable.access$400(Memtable.java:49)
at org.apache.cassandra.db.Memtable$4.runMayThrow(Memtable.java:299)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: rename failed of
/test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-hd-21-Data.db
at
org.apache.cassandra.utils.FBUtilities.renameWithConfirm(FBUtilities.java:355)
at
org.apache.cassandra.io.sstable.SSTableWriter.rename(SSTableWriter.java:371)
... 9 more
INFO [SSTableBatchOpen:1] 2013-04-09 02:49:39,917 SSTableReader.java (line
153) Opening /test/db/data/fmzd/ap.fmzd_ap_mobilityZoneUUID-hd-1 (312 bytes)
INFO [FlushWriter:2] 2013-04-09 02:49:39,916 Memtable.java (line 246)
Writing Memtable-alarm.fmzd_alarm_alarmCode@402202831(2958/22542
serialized/live bytes, 58 ops)
ERROR [main] 2013-04-09 02:49:39,916 AbstractCassandraDaemon.java (line
373) Exception encountered during startup
java.io.IOError: java.io.IOException: Failed to delete
/test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-tmp-hd-21-Statistics.db
at
org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:372)
at
org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:415)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:193)
at
org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
Caused by: java.io.IOException: Failed to delete
/test/db/data/fmzd/alarm.fmzd_alarm_alarmCode-tmp-hd-21-Statistics.db
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:54)
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:44)
at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:141)
at
org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:368)
... 4 more
INFO [OptionalTasks:1] 2013-04-09 02:49:39,923 SecondaryIndexManager.java
(line 184) Creating new index : ColumnDefinition{name=6d6f64656c,
validator=org.apache.cassandra.db.marshal.UTF8Type, index_type=KEYS,
index_name='fmzd_ap_model'}

Thanks and Regards,
Boris