cassandra-commits mailing list archives

From "Blake Eggleston (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13797) RepairJob blocks on syncTasks
Date Thu, 24 Aug 2017 22:24:00 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-13797:
----------------------------------------
    Reviewer: Marcus Eriksson
      Status: Patch Available  (was: Open)

[trunk|https://github.com/bdeggleston/cassandra/tree/13797]
[utest|https://circleci.com/gh/bdeggleston/cassandra/104]
[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/218/]

> RepairJob blocks on syncTasks
> -----------------------------
>
>                 Key: CASSANDRA-13797
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13797
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>             Fix For: 4.0
>
>
> The thread running {{RepairJob}} blocks while it waits for the validations it starts
to complete ([see here|https://github.com/bdeggleston/cassandra/blob/9fdec0a82851f5c35cd21d02e8c4da8fc685edb2/src/java/org/apache/cassandra/repair/RepairJob.java#L185]).
However, the downstream callbacks (i.e. the post-repair cleanup) aren't waiting for {{RepairJob#run}}
to return; they're waiting for a result to be set on the {{RepairJob}} future, which happens after
the sync tasks have completed. This post-repair cleanup also immediately shuts down
the executor {{RepairJob#run}} is running in. So in no-op repair sessions, where there's nothing
to stream, I'm seeing the callbacks sometimes fire before {{RepairJob#run}} wakes up,
causing an {{InterruptedException}} to be thrown.
> I'm pretty sure this blocking wait can just be removed, but I'd like a second opinion. It appears
to just be a holdover from before repair coordination became async. I thought it might be
doing some throttling by blocking, but each repair session gets its own executor, and validation
is throttled by the fixed-size executors doing the actual work of validation, so I don't
think we need to keep this around.
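
For anyone trying to picture the failure mode, here is a minimal sketch of the ordering described above. It is not the Cassandra code: {{RepairJobRaceSketch}}, {{threwInterruptedException}}, {{jobResult}} and {{neverReleased}} are hypothetical names, and plain {{java.util.concurrent}} classes stand in for the repair classes. It also forces the ordering that the ticket only sees occasionally (cleanup running while {{run}} is still blocked), so treat it as an illustration under those assumptions rather than a reproduction.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the situation in the ticket; none of these names exist in Cassandra.
final class RepairJobRaceSketch {

    // Returns true if the simulated run() saw an InterruptedException.
    static boolean threwInterruptedException(boolean blockInRun) throws Exception {
        // Stand-in for the per-session executor that run() executes in.
        ExecutorService jobExecutor = Executors.newSingleThreadExecutor();
        // Stand-in for the future that the downstream callbacks listen on.
        CompletableFuture<String> jobResult = new CompletableFuture<>();
        CompletableFuture<Boolean> outcome = new CompletableFuture<>();
        CountDownLatch runStarted = new CountDownLatch(1);
        // Never counted down: models run() still being blocked when cleanup fires.
        CountDownLatch neverReleased = new CountDownLatch(1);

        // Cleanup is keyed on the result being set, not on run() returning,
        // and it shuts the executor down immediately.
        jobResult.thenRun(jobExecutor::shutdownNow);

        jobExecutor.execute(() -> {
            runStarted.countDown();
            try {
                if (blockInRun) {
                    neverReleased.await(); // the blocking wait inside run()
                }
                outcome.complete(false);
            } catch (InterruptedException e) {
                outcome.complete(true); // shutdownNow() interrupted the blocked thread
            }
        });

        runStarted.await();
        // Setting the result runs the cleanup callback on this thread,
        // while the simulated run() has not yet woken up.
        jobResult.complete("done");
        return outcome.get(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking run interrupted: " + threwInterruptedException(true));
        System.out.println("non-blocking run interrupted: " + threwInterruptedException(false));
    }
}
```

With the blocking wait in place, the simulated {{run}} is interrupted by the shutdown. With it removed, {{run}} finishes without ever reaching an interruptible call, which is the behavior the proposed change is after.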



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

