mesos-issues mailing list archives

From "Dominic Hamon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MESOS-1866) Race between ~Authenticator() and Authenticator::authenticate() can lead to schedulers/slaves to never get authenticated
Date Mon, 06 Oct 2014 21:27:33 GMT

     [ https://issues.apache.org/jira/browse/MESOS-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dominic Hamon updated MESOS-1866:
---------------------------------
    Story Points: 2

> Race between ~Authenticator() and Authenticator::authenticate() can lead to schedulers/slaves
to never get authenticated
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-1866
>                 URL: https://issues.apache.org/jira/browse/MESOS-1866
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>            Assignee: Vinod Kone
>            Priority: Critical
>
> The master might receive a duplicate authenticate() request while a previous authentication
attempt is still in progress. Depending on what the AuthenticatorProcess is executing at that
moment, there are two possible race conditions, either of which causes the scheduler/slave to
retry authentication continuously without ever succeeding.
> We have seen both race conditions in a heavily loaded production cluster.
> Race 1:
> ----------
> --> An authenticate() event was dispatched to AuthenticatorProcess (Master::authenticate()
called Authenticator::authenticate())
> --> A terminate() event was then injected into the front of the AuthenticatorProcess
queue (duplicate Master::authenticate() did ~Authenticator) before the above authenticate()
event was executed.
> --> Due to a bug in libprocess, the future returned by Master::authenticate() was
never transitioned to discarded (Master::_authenticate() was never called).
> --> This caused all subsequent authentication retries to be enqueued on the master,
waiting for Master::_authenticate() to be executed.
> Fix: Transition the dispatched future to discarded if the libprocess process is terminated
(https://reviews.apache.org/r/25945/)
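
The ordering in Race 1 can be sketched with a toy event queue. This is only an illustration under stated assumptions: ToyProcess, ToyFuture, Event and raceOne are hypothetical names invented here, not libprocess APIs, and the model is single-threaded. It shows a terminate event jumping ahead of an already dispatched event, and what happens to that event's future with and without the discard-on-terminate behaviour described in the fix.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Toy model only: these types are hypothetical and are NOT libprocess classes.
enum class FutureState { PENDING, READY, DISCARDED };

struct ToyFuture {
  FutureState state = FutureState::PENDING;
};

struct Event {
  bool terminate;                     // true for a terminate event
  std::function<void()> run;          // work for a dispatched event
  std::shared_ptr<ToyFuture> future;  // future handed back to the dispatcher
};

class ToyProcess {
public:
  explicit ToyProcess(bool discardOnTerminate)
    : discardOnTerminate_(discardOnTerminate) {}

  // Enqueue work at the back of the queue and return its future.
  std::shared_ptr<ToyFuture> dispatch(std::function<void()> f) {
    auto future = std::make_shared<ToyFuture>();
    queue_.push_back(Event{false, std::move(f), future});
    return future;
  }

  // Inject terminate at the FRONT, ahead of already dispatched events.
  void terminate() {
    queue_.push_front(Event{true, nullptr, nullptr});
  }

  // Process events in queue order until the queue is empty or terminate is hit.
  void drain() {
    while (!queue_.empty()) {
      Event event = std::move(queue_.front());
      queue_.pop_front();
      if (event.terminate) {
        // Assumed fix: discard the futures of events that will never run.
        for (auto& pending : queue_) {
          if (discardOnTerminate_ && pending.future) {
            pending.future->state = FutureState::DISCARDED;
          }
        }
        queue_.clear();
        return;
      }
      assert(event.run && event.future);
      event.run();
      event.future->state = FutureState::READY;
    }
  }

private:
  bool discardOnTerminate_;
  std::deque<Event> queue_;
};

// Replays Race 1: dispatch authenticate(), then terminate before it executes.
FutureState raceOne(bool discardOnTerminate) {
  ToyProcess process(discardOnTerminate);
  auto future = process.dispatch([] { /* authenticate() would run here */ });
  process.terminate();
  process.drain();
  return future->state;
}
```

In this toy model, raceOne(false) leaves the future PENDING forever, so a continuation such as Master::_authenticate() would never fire; raceOne(true) ends in DISCARDED, which gives the caller something to react to.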
> Race 2:
> -----------
> --> An authenticate() event was dispatched to AuthenticatorProcess (Master::authenticate()
called Authenticator::authenticate())
> --> AuthenticatorProcess::authenticate() executed and set promise.onDiscard(defer(self,
Self::discarded)). NOTE: The internal promise of AuthenticatorProcess is discarded in AuthenticatorProcess::discarded()
> --> A terminate() event was then injected into the front of the AuthenticatorProcess
queue (duplicate Master::authenticate() did ~Authenticator) before the above discarded()
event was executed.
> --> AuthenticatorProcess was destructed without ever discarding the internal promise
(Master::_authenticate() was never called).
> --> This caused all subsequent authentication retries to be enqueued on the master,
waiting for Master::_authenticate() to be executed.
> Fix: Discard the internal promise when the AuthenticatorProcess is destructed.
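
The Race 2 fix can be sketched as a destructor that discards a still-pending promise. This is a minimal illustration, not the actual Mesos patch: SharedState, ToyAuthenticatorProcess and continuationRuns are hypothetical names, and the shared-state struct is a single-threaded stand-in for a promise/future pair, not process::Promise.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Toy stand-in for a promise/future pair; NOT libprocess's process::Promise.
enum class State { PENDING, READY, DISCARDED };

struct SharedState {
  State state = State::PENDING;
  std::vector<std::function<void()>> callbacks;  // run once on a transition

  void transition(State next) {
    assert(next != State::PENDING);
    if (state != State::PENDING) {
      return;  // transitions are one-shot
    }
    state = next;
    for (auto& callback : callbacks) {
      callback();
    }
    callbacks.clear();
  }
};

class ToyAuthenticatorProcess {
public:
  explicit ToyAuthenticatorProcess(bool discardOnDestruct)
    : discardOnDestruct_(discardOnDestruct),
      shared_(std::make_shared<SharedState>()) {}

  // Assumed fix: a promise that is still pending at destruction is discarded,
  // so anything waiting on the future gets notified.
  ~ToyAuthenticatorProcess() {
    if (discardOnDestruct_) {
      shared_->transition(State::DISCARDED);
    }
  }

  // Hands out the "future" side of the internal promise.
  std::shared_ptr<SharedState> authenticate() { return shared_; }

private:
  bool discardOnDestruct_;
  std::shared_ptr<SharedState> shared_;
};

// Replays Race 2: the process is destroyed before discarded() could run.
// Returns true if the master-side continuation was invoked.
bool continuationRuns(bool discardOnDestruct) {
  bool called = false;
  {
    ToyAuthenticatorProcess process(discardOnDestruct);
    auto future = process.authenticate();
    // Stand-in for the Master::_authenticate() continuation.
    future->callbacks.push_back([&called] { called = true; });
  }  // process destructed here with the promise still pending
  return called;
}
```

Here continuationRuns(false) returns false, mirroring the stuck retries described above, while continuationRuns(true) returns true because the destructor transitions the promise to discarded.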



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
