cassandra-commits mailing list archives

From "Sergio Bossa (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-6648) Race condition during node bootstrapping
Date Tue, 04 Feb 2014 12:38:09 GMT
Sergio Bossa created CASSANDRA-6648:
---------------------------------------

             Summary: Race condition during node bootstrapping
                 Key: CASSANDRA-6648
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6648
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Sergio Bossa
            Priority: Critical


When bootstrapping a new node, data is "missing" as if the new node didn't actually bootstrap,
which I tracked down to the following scenario:

1) New node joins token ring and waits for schema to be settled before actually bootstrapping.
2) The schema check somehow passes and it starts bootstrapping.
3) Bootstrapping doesn't find the ks/cf that it should have received from the other node.
4) Queries at this point cause NPEs until they later "recover", but the data is missing.

The problem seems to be caused by a race condition between the migration manager and the bootstrapper,
with the former running after the latter.
I think this is supposed to protect against such scenarios:
{noformat}
            while (!MigrationManager.isReadyForBootstrap())
            {
                setMode(Mode.JOINING, "waiting for schema information to complete", true);
                Uninterruptibles.sleepUninterruptibly(1, TimeUnit.SECONDS);
            }
{noformat}

But the MigrationManager.isReadyForBootstrap() implementation is quite fragile and doesn't take
into account "slow" schema propagation.
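
For illustration only, here is a minimal sketch of a stricter wait condition: it returns only once the local schema version matches the version reported by every known peer, or gives up after a timeout. This is not Cassandra code. The class SchemaAgreementWait, its methods, and the two suppliers are hypothetical stand-ins for however the node would read its own schema version and the versions gossiped by its peers, and treating "no peers known yet" as "not ready" is an assumption.

```java
import java.util.Collection;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Cassandra: waits for schema agreement before bootstrap.
public final class SchemaAgreementWait
{
    private SchemaAgreementWait() {}

    /**
     * True only when the local schema version is known and every peer reports
     * that same version. An empty peer list counts as "not ready" (assumption:
     * a bootstrapping node always has at least one peer to stream from).
     */
    public static boolean isInAgreement(UUID local, Collection<UUID> peers)
    {
        if (local == null || peers == null || peers.isEmpty())
            return false;
        for (UUID peer : peers)
        {
            // A peer whose version is still unknown (null) also fails the check.
            if (!local.equals(peer))
                return false;
        }
        return true;
    }

    /**
     * Polls both suppliers until the versions agree or the timeout elapses.
     * Returns true on agreement, false on timeout or interruption.
     */
    public static boolean awaitAgreement(Supplier<UUID> localVersion,
                                         Supplier<Collection<UUID>> peerVersions,
                                         long timeoutMillis,
                                         long pollMillis)
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true)
        {
            if (isInAgreement(localVersion.get(), peerVersions.get()))
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false;
            try
            {
                Thread.sleep(pollMillis);
            }
            catch (InterruptedException e)
            {
                // Preserve the interrupt flag and let the caller decide what to do.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The point of the sketch is that the caller gets an explicit true/false result, so a node with slowly propagating schema would either keep waiting or fail loudly instead of bootstrapping against an incomplete schema.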



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
