cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-13216) testall failure in org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
Date Wed, 29 Mar 2017 12:42:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947038#comment-15947038
] 

Alex Petrov edited comment on CASSANDRA-13216 at 3/29/17 12:42 PM:
-------------------------------------------------------------------

Found the problem. I didn't anticipate initially that this test is time-dependent. The initial
fix is still applicable. It's reproducible quite easily by adding a {{sleep}} of as little as
100 milliseconds around [here|https://github.com/apache/cassandra/blob/732d1af866b91e5ba63e7e2a467d99d4cb90e11f/test/unit/org/apache/cassandra/net/MessagingServiceTest.java#L112].
YMMV with an exact sleep number. 

However, I do not think there's any way we can reliably fetch latency numbers, since dropwizard
metrics reservoirs (used within [timers|https://github.com/dropwizard/metrics/blob/15dde825de1843927898a7ad3c3bb11b2913a931/metrics-core/src/main/java/com/codahale/metrics/Timer.java#L64])
track real time, and the snapshots we take (however precise) won't ever be perfect.
I've mocked the clock:

||[3.11|https://github.com/ifesdjeen/cassandra/tree/13216-followup-3.11]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-dtest/]|
||[trunk|https://github.com/ifesdjeen/cassandra/tree/13216-followup-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-dtest/]|

The 3.0 branch is not susceptible to this problem, since we use a time-independent [Meter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/metrics/DroppedMessageMetrics.java#L31]
instead of a timer there.
Let's wait for 24 hours; I've put the utest on retry.
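
For illustration only, here is a minimal sketch of the mocked-clock idea described above. It is not the code from the linked branches and does not use the Dropwizard or Cassandra APIs: {{NanoClock}}, {{ManualClock}} and {{LatencyRecorder}} are hypothetical names invented for this example. The point it tries to show is that a latency recorder reading time from an injected clock returns the same numbers on every run, whereas one reading real time does not.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these types exist in Cassandra or Dropwizard Metrics.
final class MockedClockSketch {

    /** Time source the recorder reads instead of calling System.nanoTime() directly. */
    interface NanoClock {
        long nanoTime();
    }

    /** Test clock: time moves only when the test advances it explicitly. */
    static final class ManualClock implements NanoClock {
        private final AtomicLong now = new AtomicLong();

        @Override
        public long nanoTime() {
            return now.get();
        }

        void advance(long amount, TimeUnit unit) {
            now.addAndGet(unit.toNanos(amount));
        }
    }

    /** Minimal stand-in for a latency timer: accumulates elapsed time between start and stop. */
    static final class LatencyRecorder {
        private final NanoClock clock;
        private long totalNanos;
        private long count;

        LatencyRecorder(NanoClock clock) {
            this.clock = clock;
        }

        /** Returns the tick at which the measured operation started. */
        long start() {
            return clock.nanoTime();
        }

        /** Records the time elapsed since the given start tick. */
        void stop(long startTick) {
            totalNanos += clock.nanoTime() - startTick;
            count++;
        }

        /** Mean recorded latency in whole milliseconds, or 0 if nothing was recorded. */
        long meanMillis() {
            return count == 0 ? 0 : TimeUnit.NANOSECONDS.toMillis(totalNanos / count);
        }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        LatencyRecorder recorder = new LatencyRecorder(clock);

        // First "message": exactly 2730 ms pass on the mocked clock.
        long tick = recorder.start();
        clock.advance(2730, TimeUnit.MILLISECONDS);
        recorder.stop(tick);

        // Second "message": exactly 2732 ms pass.
        tick = recorder.start();
        clock.advance(2732, TimeUnit.MILLISECONDS);
        recorder.stop(tick);

        System.out.println(recorder.meanMillis() + " ms");
    }
}
```

Because the clock only moves when {{advance}} is called, the mean here is 2731 ms on every run, so an assertion on the exact value no longer depends on how long the test thread happened to take.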


was (Author: ifesdjeen):
Found the problem. I didn't anticipate initially that this test is time-dependent. The initial
fix is still applicable. It's reproducible quite easily by adding a {{sleep}} of as few as
100 milliseconds around [here|https://github.com/apache/cassandra/blob/732d1af866b91e5ba63e7e2a467d99d4cb90e11f/test/unit/org/apache/cassandra/net/MessagingServiceTest.java#L112].
YMMV with an exact sleep number. 

However, I do not think there's any way we can reliably fetch latency numbers, since dropwizard
metrics reservoirs (used within [timers|https://github.com/dropwizard/metrics/blob/15dde825de1843927898a7ad3c3bb11b2913a931/metrics-core/src/main/java/com/codahale/metrics/Timer.java#L64]
are tracking real time, and snapshots we're doing (however precise) won't ever be perfect.
I've mocked the clock:

|[3.11|https://github.com/ifesdjeen/cassandra/tree/13216-3.11-followup]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-3.11-followup-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-3.11-followup-dtest/]|
|[trunk|https://github.com/ifesdjeen/cassandra/tree/13216-followup-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-trunk-followup-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-trunk-followup-dtest/]|

3.0 branch is not susceptible to this problem, since we use time-independent [Meter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/metrics/DroppedMessageMetrics.java#L31]
instead of timer there.
Let's wait for 24 hours, I've put the utest on retry.

> testall failure in org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13216
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Testing
>            Reporter: Sean McCarthy
>            Assignee: Alex Petrov
>              Labels: test-failure, testall
>             Fix For: 3.0.13, 3.11.0, 4.0
>
>         Attachments: TEST-org.apache.cassandra.net.MessagingServiceTest.log, TEST-org.apache.cassandra.net.MessagingServiceTest.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.11_testall/81/testReport/org.apache.cassandra.net/MessagingServiceTest/testDroppedMessages
> {code}
> Error Message
> expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped latency: 2730] ms>
> {code}
> {code}
> Stacktrace
> junit.framework.AssertionFailedError: expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped latency: 2730] ms>
> 	at org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages(MessagingServiceTest.java:83)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
