cassandra-commits mailing list archives

From "Omid Aladini (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-9458) Race condition causing StreamSession to get stuck in WAIT_COMPLETE
Date Fri, 22 May 2015 14:31:19 GMT
Omid Aladini created CASSANDRA-9458:

             Summary: Race condition causing StreamSession to get stuck in WAIT_COMPLETE
                 Key: CASSANDRA-9458
             Project: Cassandra
          Issue Type: Bug
            Reporter: Omid Aladini
            Priority: Critical
             Fix For: 2.0.16

I think there is a race condition in StreamSession where one side of the stream can get
stuck in WAIT_COMPLETE even though both sides have sent COMPLETE messages. Consider a scenario
in which node B is being bootstrapped and only receives files during the session:

1- During a stream session, A sends some files to B and B sends no files to A.
2- Once B completes its last task (receiving), StreamSession::maybeComplete is invoked.
3- While B is sending the COMPLETE message via StreamSession::maybeComplete, it also receives
the COMPLETE message from A, so StreamSession::complete() is invoked concurrently.
4- Both maybeComplete() and complete() therefore take the state != State.WAIT_COMPLETE
branch, and both set the state to WAIT_COMPLETE.
5- B is now waiting for a COMPLETE message that it has already received, and nothing re-checks
the state until the session times out after streaming_socket_timeout_in_ms.
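
The interleaving in steps 2-5 can be reproduced deterministically with a small model. This is
only a sketch under assumptions, not the actual StreamSession code: ToySession, observe() and
act() are hypothetical names, and each of the two paths is reduced to a "check the state" step
followed by a "set the state" step. COMPLETE here is the terminal state the session is expected
to reach once both sides are done.

```java
// Simplified model of the suspected race; NOT Cassandra's real StreamSession.
class StreamRaceSketch {
    enum State { STREAMING, WAIT_COMPLETE, COMPLETE }

    // Hypothetical stand-in for a session whose two completion paths
    // (local "last task done" and remote "COMPLETE received") share one state field.
    static final class ToySession {
        State state = State.STREAMING;

        // Check half: was the session already in WAIT_COMPLETE when this path looked?
        boolean observe() {
            return state == State.WAIT_COMPLETE;
        }

        // Act half: finish if the other side was seen first, otherwise start waiting.
        void act(boolean sawWaitComplete) {
            state = sawWaitComplete ? State.COMPLETE : State.WAIT_COMPLETE;
        }
    }

    // Both paths check before either one acts (steps 3 and 4 of the report).
    static State racyInterleaving() {
        ToySession s = new ToySession();
        boolean localSaw = s.observe();   // maybeComplete path reads the state
        boolean remoteSaw = s.observe();  // complete path reads the state
        s.act(localSaw);                  // sets WAIT_COMPLETE
        s.act(remoteSaw);                 // sets WAIT_COMPLETE again: session is stuck
        return s.state;
    }

    // Each path checks and acts as one unit, so the second path sees WAIT_COMPLETE.
    static State serializedInterleaving() {
        ToySession s = new ToySession();
        s.act(s.observe());               // first path: STREAMING -> WAIT_COMPLETE
        s.act(s.observe());               // second path: WAIT_COMPLETE -> COMPLETE
        return s.state;
    }

    public static void main(String[] args) {
        System.out.println("racy:       " + racyInterleaving());
        System.out.println("serialized: " + serializedInterleaving());
    }
}
```

In this model the racy ordering ends in WAIT_COMPLETE and the serialized ordering ends in
COMPLETE, which matches the stuck session described in step 5. Whether the real code paths can
interleave this way depends on how StreamSession guards its state, which the model does not
capture.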

In the log below:

Although the node has received COMPLETE, a SocketTimeoutException is thrown after
streaming_socket_timeout_in_ms (30 minutes here).
