cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10104) Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
Date Mon, 24 Aug 2015 10:24:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709072#comment-14709072 ]

Sam Tunnicliffe commented on CASSANDRA-10104:
---------------------------------------------

It seems like the problem is that {{Schema#getCFMetaData}} only fetches the {{KSMetaData}}
from schema to get the {{CFMetaData}}, but during {{LegacySchemaTable#mergeSchema}} the {{KSMetaData}}
is added in {{mergeKeyspaces}} before the {{ColumnFamilyStore}} is initialised in {{mergeTables}}.
So, there's a window in which the {{CFMetaData}} is retrievable, but the {{ColumnFamilyStore}}
isn't yet. Could we solve this by just changing the condition checking the table to actually
retrieve the {{CFS}} rather than the {{CFM}}? In the case where we race, this would cause
us to attempt a redundant {{addTable}}, but that will be harmless.

{code}
@@ -1029,7 +1029,7 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         // Also, the addKeyspace above can be racy if multiple nodes are started
         // concurrently - see CASSANDRA-9201
         for (Map.Entry<String, CFMetaData> table : AuthKeyspace.definition().cfMetaData().entrySet())
-            if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, table.getKey()) == null)
+            if (Schema.instance.getCF(table.getValue().cfId) == null)
                 maybeAddTable(table.getValue());
{code}

I don't think we need the check for existence of the Keyspace, because we know both the KS
and CF will ultimately be created. We just need to make sure we don't progress to the role
manager setup until they are, and waiting for the table ensures that.
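
To make the window concrete, here is a minimal toy sketch of the two-phase merge described above. It is not Cassandra code: {{ToySchema}}, {{readyByMetadata}}, {{readyByStore}} and the string table ids are hypothetical stand-ins, and the real {{Schema}} and {{ColumnFamilyStore}} classes are considerably more involved. It only shows why a check against the metadata can pass before the store exists, and why a repeated add is harmless when the add is idempotent.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only. None of these types are Cassandra's; the method names
// merely mirror the roles mentioned in the comment (assumption).
final class SchemaRaceSketch {

    /** Hypothetical stand-in for a schema registry with a two-phase merge. */
    static final class ToySchema {
        // keyspace name -> (table name -> table id); plays the role of KSMetaData/CFMetaData
        private final Map<String, Map<String, String>> metadata = new ConcurrentHashMap<>();
        // table id -> initialised store; plays the role of ColumnFamilyStore
        private final Map<String, Object> stores = new ConcurrentHashMap<>();

        // Phase 1: publishes metadata only, no store yet.
        void mergeKeyspaces(String keyspace, String table, String tableId) {
            metadata.computeIfAbsent(keyspace, k -> new ConcurrentHashMap<>()).put(table, tableId);
        }

        // Phase 2: initialises the store. putIfAbsent makes a repeated call a no-op.
        void mergeTables(String tableId) {
            stores.putIfAbsent(tableId, new Object());
        }

        String getTableMetadata(String keyspace, String table) {
            Map<String, String> ks = metadata.get(keyspace);
            return ks == null ? null : ks.get(table);
        }

        Object getStore(String tableId) {
            return stores.get(tableId);
        }
    }

    // Analogue of the old condition: looks only at metadata.
    static boolean readyByMetadata(ToySchema schema, String keyspace, String table) {
        return schema.getTableMetadata(keyspace, table) != null;
    }

    // Analogue of the proposed condition: looks at the initialised store.
    static boolean readyByStore(ToySchema schema, String tableId) {
        return schema.getStore(tableId) != null;
    }

    public static void main(String[] args) {
        ToySchema schema = new ToySchema();

        schema.mergeKeyspaces("example_ks", "example_table", "id-1");
        // Between the two phases the checks disagree.
        System.out.println("metadata check: " + readyByMetadata(schema, "example_ks", "example_table"));
        System.out.println("store check:    " + readyByStore(schema, "id-1"));

        schema.mergeTables("id-1");
        schema.mergeTables("id-1"); // redundant add, leaves the existing store in place
        System.out.println("store check after merge: " + readyByStore(schema, "id-1"));
    }
}
```

Between the two phases the metadata check reports true while the store check reports false, which is the window in question. Once the second phase has run, both agree.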

> Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10104
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10104
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Joshua McKenzie
>            Assignee: Paulo Motta
>              Labels: Windows
>             Fix For: 3.0.x
>
>
> {noformat}
> Unexpected error in node1 node log: ['ERROR [HintedHandoff:2] 2015-08-16 23:14:04,419
CassandraDaemon.java:191 - Exception in thread Thread[HintedHandoff:2,1,main] org.apache.cassandra.exceptions.WriteFailureException:
Operation failed - received 0 responses and 1 failures \tat org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:106)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.checkDelivered(HintedHandOffManager.java:358)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:414)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:346)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:91)
~[main/:na] \tat org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:537)
~[main/:na] \tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_45] \tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
~[na:1.8.0_45] \tat java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]']
> -------------------- >> begin captured logging << --------------------
> dtest: DEBUG: cluster ccm directory: d:\temp\dtest-j1ttp3
> dtest: DEBUG: Nodetool command 'D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra\bin\nodetool.bat
-h localhost -p 7100 netstats' failed; exit status: 1; stdout: Starting NodeTool
> ; stderr: nodetool: Failed to connect to 'localhost:7100' - ConnectException: 'Connection
refused: connect'.
> dtest: DEBUG: removing ccm cluster test at: d:\temp\dtest-j1ttp3
> dtest: DEBUG: clearing ssl stores from [d:\temp\dtest-j1ttp3] directory
> --------------------- >> end captured logging << ---------------------
> {noformat}
> Failure history: [consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/jmx_test/TestJMX/netstats_test/history/].
Looks to have regressed on build [#5|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/5/]
which seems unlikely given the commit.
> Env: Both, though on a local run the test fails due to:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
>     raise AssertionError('Unexpected error in %s node log: %s' % (node.name, errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 15:42:07,717
NoSpamLogger.java:97 - This platform does not support atomic directory streams (SecureDirectoryStream);
race conditions when loading sstable files could occurr', 'ERROR [main] 2015-08-17 15:50:43,978
NoSpamLogger.java:97 - This platform does not support atomic directory streams (SecureDirectoryStream);
race conditions when loading sstable files could occurr']
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
