cassandra-commits mailing list archives

From "Greg Bestland (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10730) periodic timeout errors in dtest
Date Wed, 23 Dec 2015 18:23:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069999#comment-15069999 ]

Greg Bestland commented on CASSANDRA-10730:
-------------------------------------------

Jim,
One thing that comes to mind: we had quite a few timeout issues at one point in our Jenkins
CI environment. The cause was a change made under the covers that degraded disk performance.
It might be worth investigating what disk performance looks like on those nodes. Poor disk
performance can cause all sorts of timeouts and schema agreement problems, especially when
the disk is shared across multiple C* nodes.
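
As a rough first check, a sketch along these lines could time small synced writes on the
volume a node uses. The function name, sample count, and block size are illustrative
assumptions; none of this comes from dtest or ccm.

```python
import os
import tempfile
import time


def fsync_latencies(directory, samples=20, block_size=4096):
    """Return per-write latencies (seconds) for small fsync'd writes in `directory`."""
    payload = b"\0" * block_size
    latencies = []
    # Hypothetical probe: a scratch file on the disk under test, removed on exit.
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(samples):
            start = time.perf_counter()
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the write down to the device
            latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # Assumption: the system temp directory stands in for a node's data directory.
    results = sorted(fsync_latencies(tempfile.gettempdir()))
    print("median fsync latency: %.3f ms" % (results[len(results) // 2] * 1000))
    print("worst fsync latency:  %.3f ms" % (results[-1] * 1000))
```

Running it while the test suite is active, against the directory the nodes actually write
to, would show whether sync latency spikes when several nodes share the disk.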


> periodic timeout errors in dtest
> --------------------------------
>
>                 Key: CASSANDRA-10730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10730
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>            Assignee: Jim Witschey
>
> Dtests often fail with connection timeout errors. For example:
> http://cassci.datastax.com/job/cassandra-3.1_dtest/lastCompletedBuild/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3/deletion_test/
> {code}
> ('Unable to connect to any servers', {'127.0.0.1': OperationTimedOut('errors=Timed out creating connection (10 seconds), last_host=None',)})
> {code}
> We've merged a PR to increase timeouts:
> https://github.com/riptano/cassandra-dtest/pull/663
> It doesn't look like this has improved things:
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/363/testReport/
> Next steps here are:
> * to scrape Jenkins history to see if and how the number of tests failing this way has
> increased (it feels like it has). From there we can bisect over the dtests, ccm, or C*,
> depending on what looks like the source of the problem.
> * to better instrument the dtest/ccm/C* startup process to see why the nodes start but
> don't successfully make the CQL port available.
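
One way to instrument the second step is to record how long a node takes to accept TCP
connections on its CQL port after startup. The helper below is a minimal sketch under that
assumption; `wait_for_port` is a hypothetical name, not an existing dtest or ccm function,
and a successful connect only shows the port is open, not that the node is ready to serve
CQL requests.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1):
    """Poll until a TCP connect to (host, port) succeeds.

    Returns the elapsed time in seconds, or None if `timeout` expires first.
    """
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection; success means the port is listening.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Connection refused or timed out: wait briefly, then retry.
            time.sleep(interval)
    return None
```

For example, `wait_for_port("127.0.0.1", 9042)` would probe the default native transport
port on the host from the error above; logging the returned time per node start would show
whether nodes are slow to open the port or never open it at all.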



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)