mesos-issues mailing list archives

From "Alexander Rukletsov (JIRA)" <>
Subject [jira] (MESOS-7036) Rate limiter deadlocks during IO Switchboard-related tests
Date Tue, 31 Jan 2017 20:41:51 GMT


Alexander Rukletsov commented on MESOS-7036:

The deadlock is most probably caused by an unfortunate combination of several factors:
1) A dependency between the {{iterate}} callback (which holds a reference to {{limiter}}) and
the entity ({{limiter}}) that triggers *and clears* that callback.
2) The lifetime of {{limiter}}, which is bounded by the copies of the {{iterate}} callback.

If all but one of the {{iterate}} copies that reference {{limiter}} go out of scope, the last copy
is destroyed during {{clearAllCallbacks()}} on the {{limiter}} context, which leads to the

> Rate limiter deadlocks during IO Switchboard-related tests
> ----------------------------------------------------------
>                 Key: MESOS-7036
>                 URL:
>             Project: Mesos
>          Issue Type: Bug
>          Components: test, tests
>         Environment: ASF CI
>            Reporter: Greg Mann
>              Labels: flaky, mesosphere
>         Attachments: AgentAPITest.LaunchNestedContainerSessionWithTTY.txt
> This has been observed a number of times recently on the ASF CI. While I didn't look
> through every single failed test log, I've noticed the failure occur during the following tests:
> {code}
> ContentType/AgentAPITest.LaunchNestedContainerSessionWithTTY/1
> ContentType/AgentAPITest.LaunchNestedContainerSessionWithTTY/0
> IOSwitchboardTest.ContainerAttachAfterSlaveRestart
> ContentType/AgentAPITest.LaunchNestedContainerSession/1
> ContentType/AgentAPITest.LaunchNestedContainerSessionDisconnected/1
> ContentType/AgentAPIStreamingTest.AttachContainerInput/0
> IOSwitchboardTest.ContainerAttach
> ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession/0
> {code}
> In all cases, we see the following:
> {code}
> You are waiting on process __limiter__(518)@ that it is currently executing.
> {code}
> Find attached an entire example log.

This message was sent by Atlassian JIRA