cassandra-commits mailing list archives

From "Doug Rohrer (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-15347) Add client testing capabilities to in-jvm tests
Date Fri, 01 Nov 2019 15:16:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964853#comment-16964853 ]

Doug Rohrer edited comment on CASSANDRA-15347 at 11/1/19 3:15 PM:
------------------------------------------------------------------

This set of PRs allows the in-jvm dtest framework to support native protocol clients, which
allows for testing of the Java client and other use-cases where it makes sense to test from
"outside" (Spark, for example).

 

Four PRs for different Cassandra versions:

 - 2.2 [changes|https://github.com/apache/cassandra/pull/377] [Circle|https://circleci.com/workflow-run/19f5082f-eedc-4d8e-8d33-558848fddc77]
 - 3.0 [changes|https://github.com/apache/cassandra/pull/376] [Circle|https://circleci.com/workflow-run/ddf5b452-2a51-4d3a-9cd4-d4b279e0f280]
 - 3.11 [changes|https://github.com/apache/cassandra/pull/375] [Circle|https://circleci.com/workflow-run/59c4d1b6-c0c2-4179-b719-a8c041c849ff]
 - Trunk [changes|https://github.com/apache/cassandra/pull/374] [Circle|https://circleci.com/workflow-run/fea3a793-bf13-4652-8b88-d29e1b513254]


The changes are more extensive than just "Add Native Transport Support." I ran into several
reliability issues with the tests once we started allowing connectivity via the native transport;
these issues may already have been causing some level of instability. Some of the changes also
speed up test execution. These changes include:
 - Setting {{auto_bootstrap}} to false by default for in-jvm dtests. Because the cluster is
empty, there was no reason to wait for instances to bootstrap before starting tests. Waiting
could slow down test execution, and it caused some test timing issues where requests could be
made before the instance was fully ready. Tests that need {{auto_bootstrap}} can still set
it explicitly.
 - It was possible, especially in {{trunk}}, for tests to fail to create the initial
keyspace requested in {{DistributedTestBase.init}} because of a race between a hard-coded
60-second delay in {{MigrationManager}} ({{MIGRATION_DELAY_IN_MS}}) and an identical hard-coded
60-second wait timeout in the {{SchemaChangeMonitor}}. This could occur if the instance where
the schema change was submitted did not yet see one or more other instances in its live member
list when first gossiping the schema change. Two changes were made to alleviate this
issue:
 ** Extended the {{SchemaChangeMonitor}}'s wait to 70 seconds to accommodate the {{MigrationManager}}'s
60-second delay.
 ** To avoid the root cause, and the potential 70-second delay if tests hit the race, added
a new monitor, {{LiveMemberAgreementMonitor}}, which waits for all instances to agree that
the live member count equals the expected count of running instances before moving on from
{{Cluster.startup}}. This adds a very minor potential delay to cluster startup while we wait
for the members to all see each other, but completely avoids the possibility that the
subsequent schema change will be delayed by up to 60 seconds.
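
The idea behind the live-member wait can be sketched as a simple polling loop. This is only an
illustration and not the code from the PRs: the class name {{LiveMemberAgreement}}, the method
{{awaitAgreement}}, and the use of one {{IntSupplier}} per instance to report its live member
count are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical helper, not the actual LiveMemberAgreementMonitor from the patch.
public final class LiveMemberAgreement {
    private LiveMemberAgreement() {}

    /**
     * Polls until every instance reports expectedCount live members, or the timeout expires.
     * Each supplier stands in for "ask one instance how many live members it sees".
     * Returns true on agreement, false on timeout or interruption.
     */
    public static boolean awaitAgreement(List<IntSupplier> liveMemberCounts,
                                         int expectedCount,
                                         long timeoutMillis,
                                         long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            boolean agreed = true;
            for (IntSupplier count : liveMemberCounts) {
                if (count.getAsInt() != expectedCount) {
                    agreed = false;
                    break;
                }
            }
            if (agreed) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: at least one instance still disagrees
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Under these assumptions, a startup routine would call {{awaitAgreement}} once after all instances
are started and only then submit the first schema change, so that every instance is already in
the submitting instance's live member list.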

There are a few other minor changes/refactorings that were picked up from Alex's original
patch for this change, which was never submitted to C*. He was kind enough to help me put
this together and has done some early code review as well. A new test, {{NativeTransportTest}},
was added to cover the native transport functionality, and a new {{ResourceLeakTest}} makes
sure we aren't introducing any cross-classloader references that would block collection of
classes and exhaust Java's metaspace.
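
As a rough illustration of the kind of check a leak test can make (this is not the
{{ResourceLeakTest}} from the PRs), one common approach is to hold only a {{WeakReference}}
to the object of interest, such as an instance's class loader, and see whether it is cleared
after garbage collection. The class {{LeakCheck}} and its method are hypothetical names.
{{System.gc()}} is only a hint to the JVM, so a "not collected" result suggests a leak rather
than proving one.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper for illustration only.
public final class LeakCheck {
    private LeakCheck() {}

    /**
     * Requests garbage collection up to 'attempts' times and reports whether the
     * weakly referenced object was cleared. If something still holds a strong
     * reference (for example a cross-classloader reference), this returns false.
     */
    public static boolean isCollected(WeakReference<?> ref, int attempts) {
        for (int i = 0; i < attempts && ref.get() != null; i++) {
            System.gc(); // a request, not a guarantee
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return ref.get() == null;
    }
}
```

A test built on this would shut down the cluster, drop its own references, and then assert
that a weak reference to each instance's class loader is eventually cleared.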



> Add client testing capabilities to in-jvm tests
> -----------------------------------------------
>
>                 Key: CASSANDRA-15347
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15347
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Alex Petrov
>            Assignee: Doug Rohrer
>            Priority: Normal
>              Labels: patch-available, pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Allow testing native transport code path using in-jvm tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


