cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5426) Redesign repair messages
Date Mon, 03 Jun 2013 13:36:22 GMT


Jason Brown commented on CASSANDRA-5426:

In StreamingRepairTask.initiateStreaming(), there's this block:

{code}
try {
  StreamOut.transferSSTables(outsession, sstables, request.ranges, OperationType.AES);
  // request ranges from the remote node
  StreamIn.requestRanges(request.dst, desc.keyspace, Collections.singleton(cfstore), request.ranges, this, OperationType.AES);
}
catch(Exception e) ...{code}

Is there any value in putting StreamIn.requestRanges() in a separate try block, so that we
do not (immediately) fail if StreamOut has a problem? Then we could potentially make some
forward progress (on the StreamIn side) even if StreamOut fails. I'll note that 1.2 has the
same try/catch as Yuki's new work, so it has not changed in that regard.
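
To make the suggestion concrete, here is a minimal, self-contained sketch of the "separate try block" idea. It is not the actual Cassandra code: StreamStep and runIndependently are hypothetical names standing in for the StreamOut and StreamIn calls, and the failure messages are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class IndependentStreams {
    /** Hypothetical stand-in for one direction of streaming (outbound or inbound). */
    interface StreamStep {
        void run() throws Exception;
    }

    /**
     * Runs every step in its own try block, so a failure in one step
     * does not prevent the later steps from being attempted.
     * Returns the collected failures so the caller can still report them.
     */
    static List<Exception> runIndependently(StreamStep... steps) {
        List<Exception> failures = new ArrayList<>();
        for (StreamStep step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Exception> failures = runIndependently(
            // Assumed failure on the outbound side.
            () -> { throw new IOException("outbound transfer failed"); },
            // The inbound request is still attempted.
            () -> log.add("inbound request sent"));
        System.out.println(log);             // prints [inbound request sent]
        System.out.println(failures.size()); // prints 1
    }
}
```

Note that the failures are collected rather than swallowed. Given the goal of this ticket, I would assume the task still has to report any failure back to the repair initiator once both directions have been attempted.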

> Redesign repair messages
> ------------------------
>                 Key: CASSANDRA-5426
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuki Morishita
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: repair
>             Fix For: 2.0
> Many people have been reporting 'repair hang' when something goes wrong.
> Two major causes of hang are 1) validation failure and 2) streaming failure.
> Currently, when those failures happen, the failed node would not respond back to the repair initiator.
> The goal of this ticket is to redesign message flows around repair so that repair never

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators