cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1715) More schema migration race conditions
Date Tue, 09 Nov 2010 02:04:08 GMT


Jonathan Ellis commented on CASSANDRA-1715:

UpdateColumnFamily doesn't acquireLocks().  (Shouldn't Migration do that so the subclasses
don't have to?)
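
To illustrate the parenthetical above: a minimal sketch of the base class taking the locks in a template method, so that no subclass can forget to. Everything here is hypothetical (MigrationSketch, UpdateColumnFamilySketch, and the single SCHEMA_LOCK standing in for whatever flush/compaction locks are really needed); it is not the actual Cassandra code.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch only; these are not Cassandra's real classes.
abstract class MigrationSketch {
    // Assumption: one lock stands in for the flush/compaction locks a real migration needs.
    static final ReentrantLock SCHEMA_LOCK = new ReentrantLock();

    // Template method: locking lives in the base class, so subclasses cannot skip it.
    public final void apply() {
        acquireLocks();
        try {
            applyModels();
        } finally {
            releaseLocks();
        }
    }

    protected void acquireLocks() { SCHEMA_LOCK.lock(); }

    protected void releaseLocks() { SCHEMA_LOCK.unlock(); }

    // Subclasses only describe the schema change itself.
    protected abstract void applyModels();
}

class UpdateColumnFamilySketch extends MigrationSketch {
    boolean lockHeldDuringApply = false;

    @Override
    protected void applyModels() {
        // Record whether the base class really took the lock before calling us.
        lockHeldDuringApply = SCHEMA_LOCK.isHeldByCurrentThread();
    }
}
```

With this shape, apply() is final and every subclass gets the locking for free; the cost is that a migration which genuinely does not need the locks would have to override acquireLocks() and releaseLocks() explicitly.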

bq. A new memtable would need to know about the updated meta settings for thresholds. The
timing here is tricky because of flushing (chances are you would have just flushed and have
an empty memtable anyway, but one can't be too sure).

This gets a little messy code-wise (because we allow overriding memtable settings at runtime)
but not too bad.  At worst we just set the CFS values to the new migration values when the
migration is applied.  I don't see any timing issues (Memtable.isThresholdViolated checks with
the CFS each time; it doesn't cache locally).
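
A minimal sketch of why reading through to the CFS on every check avoids the timing problem. StoreSketch and MemtableSketch are hypothetical stand-ins, and the single byte threshold is an assumption made for brevity; this is not the real Memtable or ColumnFamilyStore code.

```java
// Hypothetical sketch only; not the real ColumnFamilyStore.
class StoreSketch {
    // volatile so a value set by a migration is visible to writer threads.
    private volatile long memtableThresholdBytes;

    StoreSketch(long memtableThresholdBytes) {
        this.memtableThresholdBytes = memtableThresholdBytes;
    }

    long getMemtableThresholdBytes() { return memtableThresholdBytes; }

    // Called by a migration (or a runtime override) to change the setting.
    void setMemtableThresholdBytes(long bytes) { this.memtableThresholdBytes = bytes; }
}

// Hypothetical sketch only; not the real Memtable.
class MemtableSketch {
    private final StoreSketch store;
    private long currentBytes = 0;

    MemtableSketch(StoreSketch store) { this.store = store; }

    void put(long sizeBytes) { currentBytes += sizeBytes; }

    // No local copy of the threshold: every check asks the store for the current value.
    boolean isThresholdViolated() {
        return currentBytes >= store.getMemtableThresholdBytes();
    }
}
```

Because the memtable never caches the threshold, an existing memtable sees the new value on its next check, whether or not it was flushed around the time of the migration.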

bq. Make sure secondary indexes are dealt with properly on updates (e.g.: not reloaded needlessly).

Writing code to detect when indexes are added/dropped is a pain compared to just rebuilding
it from scratch, but efficiency-wise it seems like a win. At least by mutating in place you can avoid
redoing the index sampling every time.  Stopping updates in their tracks while we reload,
just to change read_repair_chance, is really brutal.  (If UpdateCF doesn't actually need to acquireLocks
then never mind, but I think it does.)
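
Detecting which indexes were added or dropped can be sketched as two set differences over index names. IndexDiffSketch and the use of plain strings as index identifiers are assumptions made for illustration; the real column metadata is richer than this.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; index names as strings are an assumption for illustration.
final class IndexDiffSketch {
    private IndexDiffSketch() {}

    // Indexes present in the new definition but not in the old one: build these.
    static Set<String> added(Set<String> oldIndexes, Set<String> newIndexes) {
        Set<String> result = new HashSet<>(newIndexes);
        result.removeAll(oldIndexes);
        return result;
    }

    // Indexes present in the old definition but not in the new one: drop these.
    static Set<String> dropped(Set<String> oldIndexes, Set<String> newIndexes) {
        Set<String> result = new HashSet<>(oldIndexes);
        result.removeAll(newIndexes);
        return result;
    }
}
```

When both sets come back empty (for example, an update that only changes read_repair_chance), nothing index-related needs to be touched at all.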

bq. Efficiently dealing with SSTableReader instances--certain classes of updates wouldn't
require messing with them at all, but others would (when files move).

What is making files move here?

> More schema migration race conditions
> -------------------------------------
>                 Key: CASSANDRA-1715
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7 beta 1
>            Reporter: Jonathan Ellis
>            Assignee: Gary Dusbabek
>            Priority: Critical
>             Fix For: 0.7.0
>         Attachments: v1-0001-take-drop-off-CompactionManager.txt, v1-0002-compaction-lock.txt,
v1-0003-migration-uses-locks.txt, v1-0004-handle-moved-dropped-CF-prior-to-pending-compaction-st.txt
> Related to CASSANDRA-1631.
> This is still a bug with schema updates to an existing CF, since reloadCf is doing an
> unload/init cycle. So flushing + compaction is an issue there as well. Here is a stack trace
> from an index creation where it stubbed its toe on an incomplete sstable from an in-progress
> compaction (path names anonymized):
> {code}
> INFO [CompactionExecutor:1] 2010-11-02 16:31:00,553 (line 224) Compacting ['Standard1-e-6-Data.db'),'Standard1-e-7-Data.db'),'Standard1-e-8-Data.db'),'Standard1-e-9-Data.db')]
> ...
> ERROR [MigrationStage:1] 2010-11-02 16:31:10,939 (line 244) Corrupt sstable Standard1-tmp-e-10-<>=[Data.db, Index.db]; skipped
>         at org.apache.cassandra.utils.FBUtilities.skipShortByteArray(
>         at
>         at
>         at
>         at org.apache.cassandra.db.ColumnFamilyStore.<init>(
>         at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(
>         at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(
>         at org.apache.cassandra.db.Table.initCf(
>         at org.apache.cassandra.db.Table.reloadCf(
>         at org.apache.cassandra.db.migration.UpdateColumnFamily.applyModels(
>         at org.apache.cassandra.db.migration.Migration.apply(
>         at org.apache.cassandra.thrift.CassandraServer$
>         at java.util.concurrent.FutureTask$Sync.innerRun(
>         at
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
>         at java.util.concurrent.ThreadPoolExecutor$
>         at
> ...
>  INFO [CompactionExecutor:1] 2010-11-02 16:31:31,970 (line 303) Compacted to Standard1-tmp-e-10-Data.db.  213,657,983 to 213,657,983 (~100% of original) bytes for 626,563 keys.  Time: 31,416ms.
> {code}
> There is also a race between schema modification and streaming.

