cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10870) pushed_notifications_test.py:TestPushedNotifications.restart_node_test flapping on C* 2.1
Date Thu, 21 Jan 2016 01:34:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109870#comment-15109870
] 

Stefania commented on CASSANDRA-10870:
--------------------------------------

The test checks that each time node 2 restarts, node 1 sends us exactly 3 notifications in
this order: DOWN, UP and NEW_NODE, with the correct IP address. The failures under JDK 8 and
JDK 7 are different, and neither is limited to 2.1; they also happen on 2.2 and 3.0.
I suspect 3.2+ too.
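
To make the expected sequence concrete, here is a minimal sketch of the kind of check described above. It is not the actual dtest code: the function name {{restart_notifications_ok}} and the constant are hypothetical, and the notification dictionaries only mirror the shape seen in the debug output quoted in this comment.

```python
# Hypothetical helper, not taken from pushed_notifications_test.py.
EXPECTED_CHANGE_TYPES = ["DOWN", "UP", "NEW_NODE"]


def restart_notifications_ok(notifications, expected_ip):
    """Return True only if exactly DOWN, UP, NEW_NODE arrived, in that order, for expected_ip."""
    change_types = [n["change_type"] for n in notifications]
    ips = [n["address"][0] for n in notifications]
    return change_types == EXPECTED_CHANGE_TYPES and all(ip == expected_ip for ip in ips)


# The expected sequence, and a sequence with UP and NEW_NODE swapped.
good = [{"change_type": t, "address": ("127.0.0.2", 9042)}
        for t in ["DOWN", "UP", "NEW_NODE"]]
swapped = [{"change_type": t, "address": ("127.0.0.2", 9042)}
           for t in ["DOWN", "NEW_NODE", "UP"]]

print(restart_notifications_ok(good, "127.0.0.2"))
print(restart_notifications_ok(swapped, "127.0.0.2"))
```

Because the comparison is on the whole list, an extra or duplicated notification fails the check just as a reordering does.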

On JDK 8, example [here|http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/174/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/],
the order of {{UP}} and {{NEW_NODE}} notifications is swapped.

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
{code}

An example for 2.2 is [here|http://cassci.datastax.com/job/cassandra-2.2_dtest_jdk8/156/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/]:

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
{code}

On JDK 7, example [here|http://cassci.datastax.com/job/cassandra-2.1_dtest/385/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/],
we received an extra {{NEW_NODE}} notification:

{code}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
{code}

This also happened on 3.0, except that the duplicated notification is {{UP}}, example [here|http://cassci.datastax.com/job/cassandra-3.0_dtest/508/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/]:

{code}
dtest: DEBUG: Restarting second node...
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'DOWN', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Waiting for notifications from 127.0.0.1
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'UP', 'address': ('127.0.0.2', 9042)}
dtest: DEBUG: Source 127.0.0.1 sent {'change_type': u'NEW_NODE', 'address': ('127.0.0.2', 9042)}
{code}

I'd say there is a chance we are seeing earlier notifications, sent when the node started
for the first time during cluster start-up. If this is the case, it might be enough to
add a pause before creating the waiter or, better, to start only node1, then start node2 and
wait for the 3 notifications, and only then enter the loop. If this does not fix it, then we
really have an issue in production code and you can assign the ticket to me. I can also try to
fix the test if you want me to; just assign the ticket to me in that case.
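
The ordering suggested above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a plain {{queue.Queue}} in place of the real notification waiter, the helper {{take}} and the simulated stream are hypothetical, and the order of the three start-up notifications is made up for illustration, since the actual start-up sequence has not been confirmed.

```python
import queue

NODE2 = ("127.0.0.2", 9042)
EXPECTED = ["DOWN", "UP", "NEW_NODE"]


def take(notifications, count, timeout=1.0):
    """Block until `count` notifications have been read; return their change types."""
    return [notifications.get(timeout=timeout)["change_type"] for _ in range(count)]


def simulated_stream():
    """Stand-in for what node1 pushes: 3 start-up events, then one restart of node2."""
    q = queue.Queue()
    startup = ["NEW_NODE", "UP", "NEW_NODE"]  # assumption: illustrative order only
    for change_type in startup + EXPECTED:
        q.put({"change_type": change_type, "address": NODE2})
    return q


# Entering the restart loop immediately: the first check reads stale start-up events.
stream = simulated_stream()
first_without_drain = take(stream, 3)

# Waiting for the 3 start-up notifications first, then entering the loop.
stream = simulated_stream()
take(stream, 3)  # consume the start-up notifications
first_with_drain = take(stream, 3)

print(first_without_drain == EXPECTED)
print(first_with_drain == EXPECTED)
```

If the real failures come from start-up notifications leaking into the first iteration, consuming them explicitly is more deterministic than a fixed pause, because it does not depend on timing.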

> pushed_notifications_test.py:TestPushedNotifications.restart_node_test flapping on C* 2.1
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10870
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: DS Test Eng
>             Fix For: 2.1.x
>
>
> This test flaps on CassCI on 2.1. [~aboudreault] Do I remember correctly that you did
> some work on these tests in the past few months? If so, could you have a look and see if there's
> some assumption the test makes that doesn't hold for 2.1?
> Oddly, it fails frequently under JDK8:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/lastCompletedBuild/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/
> but less frequently on JDK7:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
