From: "Sam Tunnicliffe (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Thu, 26 Nov 2015 15:13:11 +0000 (UTC)
Subject: [jira] [Commented] (CASSANDRA-10761) Possible regression of CASSANDRA-9201

[ https://issues.apache.org/jira/browse/CASSANDRA-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15028964#comment-15028964 ]

Sam Tunnicliffe commented on CASSANDRA-10761:
---------------------------------------------

Unfortunately, it seems that startup is still not race-free. The good news is that dtest CI runs are proving useful in catching these races.
{code}
[node3 ERROR] java.lang.IllegalArgumentException: Unknown CF 5bc52802-de25-35ed-aeab-188eecebb090
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:206)
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:199)
    at org.apache.cassandra.cql3.restrictions.StatementRestrictions.<init>(StatementRestrictions.java:161)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:817)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:764)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:752)
    at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:446)
    at org.apache.cassandra.auth.CassandraRoleManager.setup(CassandraRoleManager.java:144)
    at org.apache.cassandra.service.StorageService.doAuthSetup(StorageService.java:1035)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:983)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:706)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:577)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:345)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:561)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
{code}

This failure, seen in [this dtest run|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-10761-3.0-dtest/lastCompletedBuild/testReport/paging_test/TestPagingData/test_paging_using_secondary_indexes/], highlights another race condition (I ran the same test 50 times locally without error). The migration thread has successfully updated the keyspace metadata to include the new table metadata, but it hasn't yet initialized the {{ColumnFamilyStore}} (so in {{Schema::addTable}} the call to {{update}} has happened or is happening, but the call to {{initCf}} has not).
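To illustrate the window described above, here is a minimal, single-threaded sketch of the intermediate state. It is not Cassandra's code: {{SchemaRaceSketch}} and all of its methods are hypothetical stand-ins for the two steps ({{update}}, then {{initCf}}), and the two maps are an assumption made purely for illustration. It shows that a check against the metadata alone passes while a lookup of the store still fails, and that a stricter check which also looks for the store does not pass until the second step has run.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the two-step table registration; not Cassandra's actual classes.
class SchemaRaceSketch {
    // Step 1 writes here: table metadata keyed by table id.
    private final Map<UUID, String> tableMetadata = new ConcurrentHashMap<>();
    // Step 2 writes here: initialized stores keyed by table id.
    private final Map<UUID, Object> stores = new ConcurrentHashMap<>();

    // Stand-in for the metadata update (the first step).
    void updateMetadata(UUID id, String tableName) {
        tableMetadata.put(id, tableName);
    }

    // Stand-in for store initialization (the second step).
    void initStore(UUID id) {
        stores.put(id, new Object());
    }

    // The weaker check: only asks whether the metadata is visible.
    boolean metadataPresent(UUID id) {
        return tableMetadata.containsKey(id);
    }

    // Mirrors the failure mode in the trace: throws if the store is not there yet.
    Object getStore(UUID id) {
        Object store = stores.get(id);
        if (store == null) {
            throw new IllegalArgumentException("Unknown CF " + id);
        }
        return store;
    }

    // The stricter check: metadata is visible and the store is initialized.
    boolean isUsable(UUID id) {
        return metadataPresent(id) && stores.containsKey(id);
    }

    public static void main(String[] args) {
        SchemaRaceSketch schema = new SchemaRaceSketch();
        UUID id = UUID.randomUUID();

        // Freeze the state between the two steps.
        schema.updateMetadata(id, "roles");
        System.out.println("metadata present: " + schema.metadataPresent(id));
        System.out.println("usable:           " + schema.isUsable(id));
        try {
            schema.getStore(id);
        } catch (IllegalArgumentException e) {
            System.out.println("lookup failed:    " + e.getMessage());
        }

        // Once the second step completes, both checks agree.
        schema.initStore(id);
        System.out.println("usable after init: " + schema.isUsable(id));
    }
}
```

In a real node the two steps run on the migration thread while the check runs on the main thread, so the intermediate state is only observed occasionally, which would be consistent with 50 clean local runs.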
In the main thread, this partially applied schema update causes the test of the table metadata in {{maybeAddOrUpdateKeyspace}} to pass, and control returns to {{doAuthSetup}}, which in turn goes on to initialize the role manager, leading to the exception over the missing CFS.

One way to deal with this could be to block until we can be sure that the keyspace is not only set up in schema, but that its tables are actually usable. We only ever see this affecting {{system_auth}} because we attempt to use it very early during startup, but in theory {{system_traces}} and {{system_distributed}} are also susceptible to this race. I've pushed additional commits to the branches linked earlier to make that additional check, but for now only for {{system_auth}}.

[~slebresne], any better ideas?

> Possible regression of CASSANDRA-9201
> -------------------------------------
>
>                 Key: CASSANDRA-10761
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10761
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Philip Thompson
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0.1, 3.1, 2.2.x
>
>         Attachments: 10761-logs.tar.gz
>
>
> Some dtests like {{consistency_test.TestAccuracy.test_network_topology_strategy_each_quorum_counters}} are failing with the following auth-related assertion exception:
> {code}
> [node6 ERROR] java.lang.AssertionError: org.apache.cassandra.exceptions.InvalidRequestException: unconfigured table roles
>     at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:450)
>     at org.apache.cassandra.auth.CassandraRoleManager.setup(CassandraRoleManager.java:144)
>     at org.apache.cassandra.service.StorageService.doAuthSetup(StorageService.java:1036)
>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:984)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:708)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:579)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:345)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:561)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
> Caused by: org.apache.cassandra.exceptions.InvalidRequestException: unconfigured table roles
>     at org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily(ThriftValidation.java:114)
>     at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:757)
>     at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:752)
>     at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:446)
>     ... 8 more
> {code}
> This looks very similar to CASSANDRA-9201.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)