cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6603) "hung" repair results in drain hanging
Date Fri, 17 Jan 2014 18:58:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875095#comment-13875095 ]

Jason Brown commented on CASSANDRA-6603:
----------------------------------------

It looks like these lines were removed from MS.drain() in C* 2.0:

{code}setMode(Mode.DRAINING, "waiting for streaming", false);
MessagingService.instance().waitForStreaming();{code}

So, at least in 2.0, I think we won't block on streaming. [~yukim] can you comment on this?
Thanks



> "hung" repair results in drain hanging
> --------------------------------------
>
>                 Key: CASSANDRA-6603
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6603
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.2.12 w/ 1.2.13 patches
>            Reporter: Chris Burroughs
>            Priority: Minor
>         Attachments: CassandraDaemon.stack, CassandraDaemon.stack2, drain.stack
>
>
> A "hung" repair (a pile of outstanding streams with no visible progress) can result in drain never completing. This is a problem because restarting is a reasonable thing to do with a node that has a hung repair, and drain is a standard part of the restart procedure. I have had this happen more than 20 times.
> {noformat}
>  WARN [RMI TCP Connection(7752)-10.20.6.115] 2014-01-17 12:56:51,162 StorageService.java (line 288) Stopping gossip by operator request
>  INFO [RMI TCP Connection(7752)-10.20.6.115] 2014-01-17 12:56:51,162 Gossiper.java (line 1194) Announcing shutdown
>  INFO [RMI TCP Connection(7754)-10.20.6.115] 2014-01-17 12:57:09,217 StorageService.java (line 942) DRAINING: starting drain process
>  INFO [RMI TCP Connection(7754)-10.20.6.115] 2014-01-17 12:57:09,217 ThriftServer.java (line 116) Stop listening to thrift clients
>  INFO [RMI TCP Connection(7754)-10.20.6.115] 2014-01-17 12:57:09,251 Gossiper.java (line 1194) Announcing shutdown
>  INFO [RMI TCP Connection(7754)-10.20.6.115] 2014-01-17 12:57:11,252 MessagingService.java (line 694) Waiting for messaging service to quiesce
>  INFO [ACCEPT-ldc1e.clearspring.local/10.20.6.115] 2014-01-17 12:57:11,253 MessagingService.java (line 904) MessagingService shutting down server thread.
>  ...
> wait 10 minutes with nothing happening
> {noformat}
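
The report above describes a drain that never returns while streams make no progress. As a rough illustration of how waiting with a deadline differs from waiting indefinitely, here is a minimal Java sketch. It is not Cassandra code: DrainWaitSketch, streamFinished, and waitForStreams are hypothetical names, and a CountDownLatch merely stands in for "all outstanding streams have finished".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names are Cassandra classes.
public class DrainWaitSketch {
    // Stands in for "all outstanding streams have finished".
    private final CountDownLatch streamsDone;

    public DrainWaitSketch(int outstandingStreams) {
        this.streamsDone = new CountDownLatch(outstandingStreams);
    }

    // Called when one stream completes.
    public void streamFinished() {
        streamsDone.countDown();
    }

    // Bounded wait: returns true if every stream finished before the
    // deadline, false if the caller gave up waiting.
    public boolean waitForStreams(long timeout, TimeUnit unit) throws InterruptedException {
        return streamsDone.await(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        // Three streams that never make progress, as in a "hung" repair.
        DrainWaitSketch hung = new DrainWaitSketch(3);
        boolean finished = hung.waitForStreams(100, TimeUnit.MILLISECONDS);
        System.out.println("streams finished: " + finished); // prints "streams finished: false"
    }
}
```

With an unbounded await() in place of the timed one, the main method would never return for the hung case, which mirrors the symptom in the log. Whether a timeout, or dropping the wait entirely as the 2.0 change appears to do, is the right fix for drain is a question for the ticket.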



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
