aurora-issues mailing list archives

From "John Sirois (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1729) Unclean curator teardown when scheduler fails a log write.
Date Sun, 03 Jul 2016 20:58:10 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360681#comment-15360681 ]

John Sirois commented on AURORA-1729:
-------------------------------------

Noting that this issue is tied up with the current lack of support for Guava Services with
dependencies.  As a result, there is a split between the Guice/ShutdownRegistry lifecycle (which
does handle dependencies correctly) and ServiceManager, which does not, roughly [by design|https://groups.google.com/forum/#!searchin/guava-discuss/ServiceManager/guava-discuss/NazZoQs80oE/J01_KoLASEsJ].

One way to unify these might be to use ShutdownRegistry as the fundamental lifecycle mechanism
and do service startup on behalf of ServiceManager, since ServiceManager is designed to handle
external Service manipulation.
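
A rough sketch of that idea, for illustration only: services are started in dependency order
and each one's stop is registered as a shutdown action, so teardown runs in reverse order and
a dependent (e.g. a path cache) stops before the client it relies on. SimpleService,
ShutdownStack and startInOrder are hypothetical stand-ins, not the Guava Service API or
Aurora's ShutdownRegistry, and the "curator"/"cache" names are placeholders.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class LifecycleSketch {
    /** Hypothetical stand-in for a Guava Service; only start/stop are modeled. */
    interface SimpleService {
        void start();
        void stop();
    }

    /** Hypothetical stand-in for a shutdown registry: actions run in reverse registration order. */
    static final class ShutdownStack {
        private final Deque<Runnable> actions = new ArrayDeque<>();

        void addAction(Runnable action) {
            actions.push(action); // last registered is first to run
        }

        void execute() {
            while (!actions.isEmpty()) {
                try {
                    actions.pop().run();
                } catch (RuntimeException e) {
                    // Keep tearing down the remaining services even if one stop fails.
                    System.err.println("Shutdown action failed: " + e);
                }
            }
        }
    }

    /** Starts services in the given (dependency) order and registers each stop for teardown. */
    static void startInOrder(List<SimpleService> services, ShutdownStack shutdown) {
        for (SimpleService service : services) {
            service.start();
            shutdown.addAction(service::stop);
        }
    }

    /** Builds a service that only records its lifecycle events. */
    static SimpleService named(String name, List<String> events) {
        return new SimpleService() {
            @Override public void start() { events.add("start " + name); }
            @Override public void stop() { events.add("stop " + name); }
        };
    }

    /** The cache depends on the curator client, so it is listed after it. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        ShutdownStack shutdown = new ShutdownStack();
        startInOrder(Arrays.asList(named("curator", events), named("cache", events)), shutdown);
        shutdown.execute();
        return events;
    }

    public static void main(String[] args) {
        // prints [start curator, start cache, stop cache, stop curator]
        System.out.println(demo());
    }
}
```

With this ordering the cache would be stopped before the client is closed, which is the
ordering the log below suggests was violated.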

> Unclean curator teardown when scheduler fails a log write.
> ----------------------------------------------------------
>
>                 Key: AURORA-1729
>                 URL: https://issues.apache.org/jira/browse/AURORA-1729
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler, Service Discovery
>            Reporter: John Sirois
>            Assignee: John Sirois
>
> As discovered by [~StephanErb] [here|https://www.irccloud.com/pastebin/E9ExthBO/]:
> {noformat}
> W0701 12:14:53.070 [Thread-11, GuiceUtils$4:163] Trapped uncaught exception: org.apache.aurora.scheduler.storage.Storage$StorageException: There was a problem committing the transaction to the log. org.apache.aurora.scheduler.storage.Storage$StorageException: There was a problem committing the transaction to the log.
>         at org.apache.aurora.scheduler.storage.log.LogStorage.lambda$doInTransaction$85(LogStorage.java:524) ~[aurora-0.14.0.jar:na]
>         at org.apache.aurora.scheduler.storage.log.LogStorage$$Lambda$184/796968060.apply(Unknown Source) ~[na:na]
>         at org.apache.aurora.scheduler.storage.db.DbStorage.transactionedWrite(DbStorage.java:161) ~[aurora-0.14.0.jar:na]
>         at org.mybatis.guice.transactional.TransactionalMethodInterceptor.invoke(TransactionalMethodInterceptor.java:101) ~[mybatis-guice-3.7.jar:3.7]
> ...skipping...
> I0701 12:15:08.627 [Thread-11, StateMachine$Builder:389] SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION -> DEAD
> I0701 12:15:08.632922 10792 sched.cpp:1907] Asked to stop the driver
> I0701 12:15:08.633 [Thread-11, StateMachine$Builder:389] storage state machine transition READY -> STOPPED
> I0701 12:15:08.636 [Curator-Framework-0, CuratorFrameworkImpl:821] backgroundOperationsLoop exiting
> I0701 12:15:08.657 [main-EventThread, ClientCnxn$EventThread:519] EventThread shut down for session: 0x253eae5a8df106a
> I0701 12:15:08.658 [Thread-11, ZooKeeper:684] Session: 0x253eae5a8df106a closed
> I0701 12:15:08.660 [main, SchedulerMain:98] Stopping scheduler services.
> E0701 12:15:08.661 [Curator-PathChildrenCache-0, PathChildrenCache:571]  java.lang.IllegalStateException: instance must be started before calling this method
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:174) ~[guava-19.0.jar:na]
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.getChildren(CuratorFrameworkImpl.java:391) ~[curator-framework-2.10.0.jar:na]
>         at org.apache.curator.framework.recipes.cache.PathChildrenCache.refresh(PathChildrenCache.java:508) ~[curator-recipes-2.10.0.jar:na]
>         at org.apache.curator.framework.recipes.cache.RefreshOperation.invoke(RefreshOperation.java:35) ~[curator-recipes-2.10.0.jar:na]
>         at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:772) ~[curator-recipes-2.10.0.jar:na]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45-internal]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45-internal]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45-internal]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_45-internal]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45-internal]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45-internal]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45-internal]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
