cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10104) Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
Date Fri, 21 Aug 2015 17:29:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707082#comment-14707082 ]

Paulo Motta commented on CASSANDRA-10104:
-----------------------------------------

The initial error was related to a hinted handoff failure, but that error [went away in the
latest build|http://cassci.datastax.com/view/win32/job/cassandra-3.0_dtest_win32/lastCompletedBuild/testReport/jmx_test/TestJMX/netstats_test_2/]
with the introduction of the new [hinted handoff implementation|https://issues.apache.org/jira/browse/CASSANDRA-6230].
There is now a new problem:

{noformat}
CassandraDaemon.java:635 - Exception encountered during startup
java.lang.IllegalArgumentException: Unknown CF 5bc52802-de25-35ed-aeab-188eecebb090
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:209) ~[main/:na]
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:202) ~[main/:na]
    at org.apache.cassandra.cql3.restrictions.StatementRestrictions.<init>(StatementRestrictions.java:125) ~[main/:na]
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:790) ~[main/:na]
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:740) ~[main/:na]
    at org.apache.cassandra.auth.CassandraRoleManager.prepare(CassandraRoleManager.java:423) ~[main/:na]
    at org.apache.cassandra.auth.CassandraRoleManager.setup(CassandraRoleManager.java:139) ~[main/:na]
    at org.apache.cassandra.service.StorageService.doAuthSetup(StorageService.java:1044) ~[main/:na]
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:975) ~[main/:na]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:696) ~[main/:na]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:570) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:320) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622) [main/:na]
{noformat}

This looks like a race similar to [CASSANDRA-9201|https://issues.apache.org/jira/browse/CASSANDRA-9201]
that occurs when multiple nodes are started concurrently. Apparently the new auth schema
is already available, but the ColumnFamilyStore has not yet been created when
{{CassandraRoleManager}} retrieves roles.

The not-so-elegant fix is to wait until the CFS is available in a busy loop before calling
{{CassandraRoleManager.setup()}}. Maybe there's a better way of synchronizing this, so I'm
open to suggestions.
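
To make the idea concrete, here is a minimal sketch of such a wait loop. It is not the code from the patch: the {{StartupWait}} class, the {{awaitCondition}} method, and the timeout and poll parameters are all hypothetical names chosen for illustration, and the check for "the CFS is available" is left as a caller-supplied condition because the exact call used in the patch is not shown here. The sketch adds a deadline so that startup fails loudly instead of spinning forever if the store never appears.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Cassandra: polls a condition until it holds.
final class StartupWait
{
    private StartupWait() {}

    /**
     * Blocks until the condition returns true, sleeping pollMillis between checks.
     * Throws TimeoutException if the condition is still false after timeoutMillis.
     */
    static void awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException, TimeoutException
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean())
        {
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0)
                throw new TimeoutException("condition not met within " + timeoutMillis + " ms");
            Thread.sleep(pollMillis);
        }
    }
}
```

In the scenario above, the condition passed in would be whatever check reports that the auth table's ColumnFamilyStore exists (an assumption on my part), and {{CassandraRoleManager.setup()}} would be called only after {{awaitCondition}} returns. Sleeping between polls keeps the loop from burning a CPU core while the schema is still being applied.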

The patch is available [here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:10104-3.0]
for review. Test results will be available shortly at the links below:
* [3.0 testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10104-3.0-testall/lastCompletedBuild/testReport/]
* [3.0 dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10104-3.0-dtest/lastCompletedBuild/testReport/]
* [trunk testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10104-trunk-testall/lastCompletedBuild/testReport/]
* [trunk dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10104-trunk-dtest/lastCompletedBuild/testReport/]



> Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10104
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10104
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Joshua McKenzie
>            Assignee: Paulo Motta
>              Labels: Windows
>             Fix For: 3.0.x
>
>
> {noformat}
> Unexpected error in node1 node log: ['ERROR [HintedHandoff:2] 2015-08-16 23:14:04,419
CassandraDaemon.java:191 - Exception in thread Thread[HintedHandoff:2,1,main] org.apache.cassandra.exceptions.WriteFailureException:
Operation failed - received 0 responses and 1 failures \tat org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:106)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.checkDelivered(HintedHandOffManager.java:358)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:414)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:346)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:91)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:537)
~[main/:na] \tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_45] \tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[na:1.8.0_45] \tat java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]']
> -------------------- >> begin captured logging << --------------------
> dtest: DEBUG: cluster ccm directory: d:\temp\dtest-j1ttp3
> dtest: DEBUG: Nodetool command 'D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra\bin\nodetool.bat
-h localhost -p 7100 netstats' failed; exit status: 1; stdout: Starting NodeTool
> ; stderr: nodetool: Failed to connect to 'localhost:7100' - ConnectException: 'Connection
refused: connect'.
> dtest: DEBUG: removing ccm cluster test at: d:\temp\dtest-j1ttp3
> dtest: DEBUG: clearing ssl stores from [d:\temp\dtest-j1ttp3] directory
> --------------------- >> end captured logging << ---------------------
> {noformat}
> Failure history: [consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/jmx_test/TestJMX/netstats_test/history/].
Looks to have regressed on build [#5|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/5/]
which seems unlikely given the commit.
> Env: Both, though on a local run the test fails due to:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
>     raise AssertionError('Unexpected error in %s node log: %s' % (node.name, errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 15:42:07,717
NoSpamLogger.java:97 - This platform does not support atomic directory streams (SecureDirectoryStream);
race conditions when loading sstable files could occurr', 'ERROR [main] 2015-08-17 15:50:43,978
NoSpamLogger.java:97 - This platform does not support atomic directory streams (SecureDirectoryStream);
race conditions when loading sstable files could occurr']
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
