aurora-issues mailing list archives

From "Maxim Khutornenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1615) Preemptor crashes scheduler during host maintenance
Date Thu, 11 Feb 2016 22:23:18 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143611#comment-15143611 ]

Maxim Khutornenko commented on AURORA-1615:
-------------------------------------------

The maintenance mode is currently just a hint to the scheduler to avoid using hosts scheduled
for maintenance: https://github.com/apache/aurora/blob/9ed81a7db58f6a7cb308c8ac6a545705351c8c0e/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java#L278

The above preference order makes sure that tasks will still schedule even if someone puts the
entire cluster into maintenance mode. I think we should approach this similarly in the preemptor
and still allow hosts scheduled for maintenance to participate in preemption rounds.
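
To illustrate what "a hint, not a filter" means here, below is a minimal sketch of ranking
offers by maintenance mode. It is not the Aurora code: Mode, Offer and PreferenceOrder are
hypothetical stand-ins, and the mode order (NONE first, DRAINED last) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Aurora classes.
// Declaration order is assumed to double as the preference order.
enum Mode { NONE, SCHEDULED, DRAINING, DRAINED }

final class Offer {
  final String id;
  final Mode mode;

  Offer(String id, Mode mode) {
    this.id = id;
    this.mode = mode;
  }

  @Override
  public String toString() {
    return id + "(" + mode + ")";
  }
}

final class PreferenceOrder {
  // Hosts not in maintenance sort first; ties are broken by offer id.
  static final Comparator<Offer> BY_PREFERENCE =
      Comparator.comparing((Offer o) -> o.mode).thenComparing(o -> o.id);

  // Ranks the candidates. Nothing is filtered out, so a cluster that is
  // entirely in maintenance still yields usable offers.
  static List<Offer> rank(List<Offer> offers) {
    List<Offer> ranked = new ArrayList<>(offers);
    ranked.sort(BY_PREFERENCE);
    return ranked;
  }

  public static void main(String[] args) {
    System.out.println(rank(Arrays.asList(
        new Offer("o1", Mode.DRAINING),
        new Offer("o2", Mode.NONE),
        new Offer("o3", Mode.SCHEDULED))));
  }
}
```

Applying the same idea in the preemptor would mean ranking maintenance hosts last rather than
excluding them.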

> Preemptor crashes scheduler during host maintenance
> ---------------------------------------------------
>
>                 Key: AURORA-1615
>                 URL: https://issues.apache.org/jira/browse/AURORA-1615
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Maxim Khutornenko
>
> We have noticed an occasional scheduler failover when host maintenance is in effect:
> {noformat}
> To index multiple values under a key, use Multimaps.index.
>         at com.google.common.collect.Maps.uniqueIndex(Maps.java:1215)
>         at com.google.common.collect.Maps.uniqueIndex(Maps.java:1173)
>         at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.lambda$run$224(PendingTaskProcessor.java:130)
>         at org.apache.aurora.scheduler.storage.db.DbStorage.read(DbStorage.java:138)
>         at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101)
>         at org.apache.aurora.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:83)
>         at org.apache.aurora.scheduler.storage.log.LogStorage.read(LogStorage.java:570)
>         at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:113)
>         at org.apache.aurora.scheduler.preemptor.PendingTaskProcessor.run(PendingTaskProcessor.java:119)
> {noformat}
> Diffing the colliding HostOffer objects revealed that the only difference is the HostAttributes
> maintenance mode value:
> mode=NONE vs. mode=DRAINING
> Upon examination, it appears quite possible to see duplicate HostOffer instances
> (same offer, same slave, different maintenance mode) because [offers are accessed|https://github.com/apache/aurora/blob/9ed81a7db58f6a7cb308c8ac6a545705351c8c0e/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java#L223-L226]
> through an unmodifiable view over the underlying ConcurrentSkipListSet. Here is a possible sequence:
> # Pending task processor starts [building unique index|https://github.com/apache/aurora/blob/2e2371481d9aaccd6a45ad0f442d963d5ae7a3c8/src/main/java/org/apache/aurora/scheduler/preemptor/PendingTaskProcessor.java#L128-L130]
> and the offers iterator pulls OfferA with mode=NONE
> # A host drain operation is initiated, a HostAttributesChanged event is raised
> # OfferManager [processes|https://github.com/apache/aurora/blob/9ed81a7db58f6a7cb308c8ac6a545705351c8c0e/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java#L243-L246]
> the HostAttributesChanged event and atomically [swaps|https://github.com/apache/aurora/blob/9ed81a7db58f6a7cb308c8ac6a545705351c8c0e/src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java#L315-L322]
> OfferA with OfferA' (mode=DRAINING)
> # iterator.next() inside of the uniqueIndex routine pulls OfferA' and the error is raised.
> We should either copy inside a synchronized getOffers() implementation or deal with possible
> duplicates at the call site. I tend to think copying on access is the better approach. The only
> consumer of getOffers() is PendingTaskProcessor, which has a relatively infrequent run loop (1
> minute), so the perf impact of copying all offers within a synchronized method should
> be acceptable. The alternative implies leaking the abstraction of host maintenance mode into
> the preemptor, which is less than ideal.
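
The sequence and the copy-on-access proposal from the description can be sketched as follows.
This is only an illustration under stated assumptions: HostMode, SimpleOffer, OfferStore and
SnapshotDemo are hypothetical stand-ins for the Aurora classes, and the set is assumed to be
ordered by mode first and offer id second. Whether the live iterator actually observes the
swapped offer depends on timing and on the weakly consistent iterator of ConcurrentSkipListSet,
so the first printed value is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical stand-ins for MaintenanceMode, HostOffer and OfferManager.
enum HostMode { NONE, DRAINING }

final class SimpleOffer {
  final String id;
  final HostMode mode;

  SimpleOffer(String id, HostMode mode) {
    this.id = id;
    this.mode = mode;
  }
}

final class OfferStore {
  // Assumed ordering: mode first, then id, so a swapped offer moves to a later position.
  private final Set<SimpleOffer> offers = new ConcurrentSkipListSet<>(
      Comparator.comparing((SimpleOffer o) -> o.mode).thenComparing(o -> o.id));

  synchronized void add(SimpleOffer offer) {
    offers.add(offer);
  }

  // Replaces an offer with a copy that carries the new maintenance mode.
  synchronized void changeMode(SimpleOffer old, HostMode mode) {
    if (offers.remove(old)) {
      offers.add(new SimpleOffer(old.id, mode));
    }
  }

  // Live view: an iteration in progress may observe a swap that happens meanwhile.
  Iterable<SimpleOffer> liveView() {
    return Collections.unmodifiableSet(offers);
  }

  // Copy on access: the snapshot is taken under the same lock as the swap.
  synchronized List<SimpleOffer> getOffers() {
    return Collections.unmodifiableList(new ArrayList<>(offers));
  }
}

final class SnapshotDemo {
  // Returns true if any offer id appears more than once.
  static boolean hasDuplicateIds(Iterable<SimpleOffer> offers) {
    Set<String> ids = new HashSet<>();
    for (SimpleOffer offer : offers) {
      if (!ids.add(offer.id)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    OfferStore store = new OfferStore();
    SimpleOffer a = new SimpleOffer("A", HostMode.NONE);
    store.add(a);
    store.add(new SimpleOffer("B", HostMode.NONE));

    Iterator<SimpleOffer> live = store.liveView().iterator();
    List<SimpleOffer> seen = new ArrayList<>();
    seen.add(live.next());                          // iterator pulls A with mode=NONE
    List<SimpleOffer> snapshot = store.getOffers(); // copy taken before the swap
    store.changeMode(a, HostMode.DRAINING);         // swap happens mid-iteration
    while (live.hasNext()) {
      seen.add(live.next());                        // may now also pull A' (mode=DRAINING)
    }

    System.out.println("live view saw duplicate ids: " + hasDuplicateIds(seen));
    System.out.println("snapshot saw duplicate ids: " + hasDuplicateIds(snapshot));
  }
}
```

The snapshot can never contain both versions of an offer because changeMode() and getOffers()
share one lock, which keeps the maintenance mode detail out of the caller.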



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
