cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10844) failed_bootstrap_wiped_node_can_join_test is failing
Date Thu, 31 Dec 2015 08:45:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075845#comment-15075845 ]

Stefania commented on CASSANDRA-10844:
--------------------------------------

+1, I agree with your suggested fix to backport [this commit|https://github.com/yukim/cassandra/commit/5f7fd497ae83f813078d56ba1b61f7ea322e5d5a]
to 2.1 and I verified that it fixes the test locally on Windows.

When we worked on CASSANDRA-9765, we were not yet running CI on all branches. Since the fix
went into 2.0 and CASSANDRA-7069 is only on 2.1+, it is entirely possible that the test has
been failing on 2.1 since the beginning. What I recall is that we had several issues with
the new CASSANDRA-9765 tests on Jenkins, and I had to follow up with two more pull requests
to fix them (#424 and #448). This one must have been left out.

> failed_bootstrap_wiped_node_can_join_test is failing
> ----------------------------------------------------
>
>                 Key: CASSANDRA-10844
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10844
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Streaming and Messaging, Testing
>            Reporter: Philip Thompson
>            Assignee: Joel Knighton
>             Fix For: 2.1.x
>
>         Attachments: node1.log, node2.log
>
>
> {{bootstrap_test.TestBootstrap.failed_bootstrap_wiped_node_can_join_test}} is failing on 2.1-head. The second node fails to join the cluster. I see a lot of exceptions in node1's log, such as:
> {code}
> ERROR [STREAM-OUT-/127.0.0.2] 2015-12-11 12:06:13,778 StreamSession.java:505 - [Stream #7b5ec5a0-a029-11e5-bad9-ffd0922f40e6] Streaming error occurred
> java.io.IOException: Broken pipe
>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_51]
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_51]
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_51]
>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_51]
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_51]
>         at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[main/:na]
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[main/:na]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [main/:na]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) [main/:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> {code}
> This seems consistent with node2 being killed, so the bootstrap fails. But then when restarting node2, it does not join. It *looks* like it fails to rejoin because of a false positive in checking the 2 minute rule.
> {code}
> ERROR [main] 2015-12-11 12:06:17,954 CassandraDaemon.java:579 - Exception encountered during startup
> java.lang.UnsupportedOperationException: Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true
>         at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:559) ~[main/:na]
>         at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:789) ~[main/:na]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:721) ~[main/:na]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612) ~[main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:387) [main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562) [main/:na]
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) [main/:na]
> {code}
> This fails consistently locally and on cassci. Logs attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
