aurora-issues mailing list archives

From "Kevin Sweeney (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AURORA-667) aurora ConcurrentModificationException if specific job is PENDING/THROTTLED
Date Tue, 26 Aug 2014 18:13:58 GMT

    [ https://issues.apache.org/jira/browse/AURORA-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111070#comment-14111070 ]

Kevin Sweeney edited comment on AURORA-667 at 8/26/14 6:12 PM:
---------------------------------------------------------------

Looking at the docs for synchronizedMultimap, it looks like we need to synchronize on the multimap
itself when accessing its collection views, not on the returned collection.

So the MVP fix would be
{code}
 for (K key : keys) {
-  builder.addAll(index.get(key));
+  synchronized (index) {
+    builder.addAll(index.get(key));
+  }
 }
 return builder.build();
{code}

If we wanted a fully consistent lookup, we could synchronize on index outside the for-loop.
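
As a rough illustration of that fully consistent variant, here is a minimal JDK-only sketch. It is not the MemTaskStore code: the class name SecondaryIndexSketch and its insert/remove/getMatches methods are hypothetical, and a plain HashMap of HashSets guarded by a single monitor stands in for the Guava synchronized multimap. The point it shows is that holding the lock across the whole loop means no writer can modify a value set while it is being copied, and that all keys are read from the same state.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a secondary index; not the Aurora implementation. */
final class SecondaryIndexSketch<K, V> {
  // Assumption: every read and write of this map goes through synchronized (index).
  private final Map<K, Set<V>> index = new HashMap<>();

  void insert(K key, V value) {
    synchronized (index) {
      index.computeIfAbsent(key, k -> new HashSet<>()).add(value);
    }
  }

  void remove(K key, V value) {
    synchronized (index) {
      Set<V> values = index.get(key);
      if (values != null) {
        values.remove(value);
        if (values.isEmpty()) {
          index.remove(key);
        }
      }
    }
  }

  /** Fully consistent lookup: the lock is held across the entire loop. */
  Set<V> getMatches(Iterable<K> keys) {
    Set<V> matches = new HashSet<>();
    synchronized (index) {
      for (K key : keys) {
        Set<V> values = index.get(key);
        if (values != null) {
          // Copy while holding the lock so no writer can mutate the set mid-iteration.
          matches.addAll(values);
        }
      }
    }
    return Collections.unmodifiableSet(matches);
  }
}
```

The trade-off is that writers are blocked for the duration of the whole lookup rather than for one key at a time, which is why the per-key lock is the smaller change.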

Of course, this is one of the many reasons we're continuing to get out of the business of implementing
a database, per AURORA-286.


> aurora ConcurrentModificationException if specific job is PENDING/THROTTLED
> ---------------------------------------------------------------------------
>
>                 Key: AURORA-667
>                 URL: https://issues.apache.org/jira/browse/AURORA-667
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Bhuvan Arumugam
>
> I'm running into this issue when a specific job {{armijo-prod-passive-check}} is THROTTLED or PENDING. When other jobs go to THROTTLED or PENDING, we don't see this exception.
> Can you let me know why we see this exception only for this specific job? I could replicate it in one of my clusters. Let me know if you need the aurora config.
> We are running a week old scheduler, as of this commit: https://github.com/apache/incubator-aurora/commit/20bb549ba3bd2fe0aeafab4275bd3b701c1b46f6
> {code}
> I0826 17:15:52.392 THREAD969679 com.twitter.common.util.StateMachine$Builder$1.execute: 1409073352392-armijo-prod-passive-check-424-d8e3c9ed-4017-41b9-b495-953891b000d2 state machine transition INIT -> THROTTLED
> I0826 17:15:52.392 THREAD969679 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1409073352392-armijo-prod-passive-check-424-d8e3c9ed-4017-41b9-b495-953891b000d2
> E0826 17:15:52.392 THREAD125 org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute: java.util.concurrent.ExecutionException: java.util.ConcurrentModificationException
> java.util.concurrent.ExecutionException: java.util.ConcurrentModificationException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>         at org.apache.aurora.scheduler.base.AsyncUtil$1.afterExecute(AsyncUtil.java:66)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:960)
>         at com.google.common.collect.AbstractMapBasedMultimap$WrappedCollection$WrappedIterator.next(AbstractMapBasedMultimap.java:486)
>         at com.google.common.collect.ImmutableCollection$Builder.addAll(ImmutableCollection.java:281)
>         at com.google.common.collect.ImmutableCollection$ArrayBasedBuilder.addAll(ImmutableCollection.java:360)
>         at com.google.common.collect.ImmutableSet$Builder.addAll(ImmutableSet.java:508)
>         at org.apache.aurora.scheduler.storage.mem.MemTaskStore$SecondaryIndex$1.apply(MemTaskStore.java:421)
>         at org.apache.aurora.scheduler.storage.mem.MemTaskStore$SecondaryIndex$1.apply(MemTaskStore.java:415)
>         at com.google.common.base.Present.transform(Present.java:71)
>         at org.apache.aurora.scheduler.storage.mem.MemTaskStore$SecondaryIndex.getMatches(MemTaskStore.java:428)
>         at org.apache.aurora.scheduler.storage.mem.MemTaskStore.matches(MemTaskStore.java:292)
>         at org.apache.aurora.scheduler.storage.mem.MemTaskStore.fetchTasks(MemTaskStore.java:122)
>         at com.twitter.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:87)
>         at org.apache.aurora.scheduler.storage.Storage$Util$2.apply(Storage.java:300)
>         at org.apache.aurora.scheduler.storage.Storage$Util$2.apply(Storage.java:297)
>         at org.apache.aurora.scheduler.storage.mem.MemStorage.weaklyConsistentRead(MemStorage.java:204)
>         at com.twitter.common.inject.TimedInterceptor.invoke(TimedInterceptor.java:87)
>         at org.apache.aurora.scheduler.storage.log.LogStorage.weaklyConsistentRead(LogStorage.java:587)
>         at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.weaklyConsistentRead(CallOrderEnforcingStorage.java:123)
>         at org.apache.aurora.scheduler.storage.Storage$Util.weaklyConsistentFetchTasks(Storage.java:297)
>         at org.apache.aurora.scheduler.async.HistoryPruner$3.run(HistoryPruner.java:154)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         ... 2 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
