cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9201) Race condition on creating system_auth.roles and using it on startup
Date Wed, 29 Apr 2015 09:42:06 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sam Tunnicliffe updated CASSANDRA-9201:
---------------------------------------
    Attachment: 9201.txt

When multiple nodes are started concurrently, StorageService#doAuthSetup can race with defs
updates. 

The failure in the logs attached goes like this:

# All 4 nodes are started within a couple of hundred ms 
# Nodes 2, 3 & 4 detect that the {{system_auth}} ks is not present and create & announce
it
# Node 1 receives a {{DefinitionsUpdate}} from one of the peers and starts to process it on
the {{MigrationStage}}
# Node 1 adds the new keyspace to {{Schema.instance}} in {{LegacySchemaTables.mergeKeyspaces}},
also in the {{MigrationStage}} thread
# Back in the {{main}} thread, Node 1 enters {{StorageService#doAuthSetup}}. The auth keyspace
has been added, so it doesn't attempt to create it
# Still in the {{main}} thread, Node 1 now moves onto the {{setup}} method of the {{IRoleManager}},
which in the configured impl attempts to prepare a statement against a {{system_auth.roles}}.
# This fails as the {{MigrationStage}} thread hasn't yet processed the tables in the defs
update.

As the table definitions in {{AuthKeyspace}} use deterministic IDs, their creation is idempotent
so it's safe to *always* call {{maybeAddTable}}, which eliminates the race. 

Unfortunately, I haven't been able to repro the dtest failures locally so I can't prove that
this fixes the issue, but the race seems pretty evident from the attached logs.

> Race condition on creating system_auth.roles and using it on startup
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-9201
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9201
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0
>
>         Attachments: 9201.txt, node2.log
>
>
> It looks like it's possible for {{system_auth.roles}} to have a statement prepared against
it before the table exists on startup:
> {noformat}
> ERROR [main] 2015-04-15 15:12:35,626 CassandraDaemon.java: Exception encountered during
startup
> java.lang.AssertionError: org.apache.cassandra.exceptions.InvalidRequestException: unconfigured
table roles
>     at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:427)
~[main/:na]
>     at org.apache.cassandra.auth.CassandraRoleManager.setup(CassandraRoleManager.java:139)
~[main/:na]
>     at org.apache.cassandra.service.StorageService.doAuthSetup(StorageService.java:1009)
~[main/:na]
>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:936)
~[main/:na]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:670)
~[main/:na]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:557)
~[main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:412) [main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:561)
[main/:na]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:668) [main/:na]
> Caused by: org.apache.cassandra.exceptions.InvalidRequestException: unconfigured table
roles
>     at org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily(ThriftValidation.java:115)
~[main/:na]
>     at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:739)
~[main/:na]
>     at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:423)
~[main/:na]
>     ... 8 common frames omitted
> INFO  [StorageServiceShutdownHook] 2015-04-15 15:12:35,636 Gossiper.java: Announcing
shutdown
> INFO  [MigrationStage:1] 2015-04-15 15:12:35,639 Schema.java: Loading org.apache.cassandra.config.CFMetaData@57a4b158[cfId=5bc52802-de25-35ed-aeab-188eecebb090,ksName=system_auth,cfName=roles,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.ColumnToCollectionType(6d656d6265725f6f66:org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.marshal.UTF8Type))),comment=role
definitions,readRepairChance=0.0,dcLocalReadRepairChance=0.0,gcGraceSeconds=7776000,defaultValidator=org.apache.cassandra.db.marshal.BytesType,keyValidator=org.apache.cassandra.db.marshal.UTF8Type,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=member_of,
type=org.apache.cassandra.db.marshal.SetType(org.apache.cassandra.db.marshal.UTF8Type), kind=REGULAR,
componentIndex=0, indexName=null, indexType=null}, ColumnDefinition{name=is_superuser, type=org.apache.cassandra.db.marshal.BooleanType,
kind=REGULAR, componentIndex=0, indexName=null, indexType=null}, ColumnDefinition{name=role,
type=org.apache.cassandra.db.marshal.UTF8Type, kind=PARTITION_KEY, componentIndex=null, indexName=null,
indexType=null}, ColumnDefinition{name=salted_hash, type=org.apache.cassandra.db.marshal.UTF8Type,
kind=REGULAR, componentIndex=0, indexName=null, indexType=null}, ColumnDefinition{name=can_login,
type=org.apache.cassandra.db.marshal.BooleanType, kind=REGULAR, componentIndex=0, indexName=null,
indexType=null}],compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=3600000,caching={"keys":"ALL",
"rows_per_partition":"NONE"},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,droppedColumns={},triggers=[],isDense=false]
> INFO  [MigrationStage:1] 2015-04-15 15:12:35,654 ColumnFamilyStore.java: Initializing
system_auth.roles
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message