accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3745) deadlock in SourceSwitchingIterator
Date Thu, 23 Apr 2015 05:00:41 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508462#comment-14508462
] 

Josh Elser commented on ACCUMULO-3745:
--------------------------------------

Wow, great fix guys! Some thoughts if you'll humor me:

{code}
   public SourceSwitchingIterator(DataSource source, boolean onlySwitchAfterRow) {
     this(source, onlySwitchAfterRow, Collections.synchronizedList(new ArrayList<SourceSwitchingIterator>()));
+    copies.add(this);
   }
{code}

This appears to be the only place {{copies}} is accessed when {{copies}} isn't already synchronized
on (and I don't think it would even *need* to be synchronized on its own to add {{this}}).
Is there any reason to wrap the the List in a synchronizedList anymore? Would it be more straightforward
to not have the two "layers" of synchronization?

{code}
   @Override
-  public synchronized SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment
env) {
-    return new SourceSwitchingIterator(source.getDeepCopyDataSource(env), onlySwitchAfterRow,
copies);
+  public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
+    synchronized (copies) {
+      synchronized(this){
+        SourceSwitchingIterator ssi = new SourceSwitchingIterator(source.getDeepCopyDataSource(env),
onlySwitchAfterRow, copies);
+        copies.add(ssi);
+        return ssi;
+      }
+    }
   }
{code}

I think we could avoid the nested synchronization if we actually made a copy of {{copies}}
in {[deepCopy}}, but, after sketching it out, I'm not really sure it buys much anything. I
think it primarily caught my eye because deepCopy wasn't actually copying the internal members
which seemed out of the ordinary.

At first glance, I think the patch looks good! Some new comments in here would be nice since
this class seems to be a repeat offender in terms of synchronization :)

> deadlock in SourceSwitchingIterator
> -----------------------------------
>
>                 Key: ACCUMULO-3745
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3745
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2
>         Environment: Large production cluster, with complex iterator trees.
>            Reporter: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.3, 1.7.0, 1.6.3
>
>         Attachments: ACCUMULO-3745-1.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Details come from an offline cluster, so it's difficult to reproduce the exact details.
 A very complex iterator was running over tablet. "deepCopy" may have been called a couple
dozen times, which may have contributed to the problem.
> Relevant facts:
> A scan and a minor compaction created a deadlock which was detected by the java runtime.
> {noformat}
> "Query... ":
>   waiting to lock monitor 0x1234 (object 0x1234, a java.util.Collections$SynchronizedRandomAccessList),

>   which is held by "minor compactor 1"
> "minor compactor 1":
>  waiting to lock monitor 0x9876 (object 0x9876, a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator),

>  which is held by "Query..."
> {noformat}
> Java stacks:
> {noformat}
> "Query..."
>   at java.util.Collections@SynchronizedCollection.add(Collections.java:1636)
>   - waiting to lock <0x1234> (a java.util.Collections$SynchronizedRandomAccessList)
>   at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.<init>(SourceSwitchingIterator.java:72)
>  at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.deepCopy(SourceSwitchingIterator:85)
>  - locked <0x9876> (a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator)
>   ... PartialMutationSkippingIterator.deepCopy(InMememoryMap.java:113)
>  ... InMemoryMap#MemoryIterator.deepCopy(InnMemoryMap.java:623)
>  ...
> {noformat}
> and:
> {noformat}
> "minor compactor 1":
>  at org.apache.accumulo.core.iterators.system.SourceSwitchingIterarot._switchNow(SourceSwitchingIterator:171)
>  - waiting to lock <0x9876> (a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator)
>  at org.apache.accumulo.iterators.system.SourceSwitchingIterator.switchNow(SourceSwitchingIterator.java:184)
>  locked <0x1234> (a java.util.Collections#SynhronizedRandomAccessList)
>  at org.apache.accumulo.tserver.InMemoryMap$MemoryIterator.switchNow(InMemoryMap.java:647)
>  ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message