accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3745) deadlock in SourceSwitchingIterator
Date Thu, 23 Apr 2015 14:49:39 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509165#comment-14509165
] 

Keith Turner commented on ACCUMULO-3745:
----------------------------------------

bq. This appears to be the only place copies is accessed when copies isn't already synchronized
on

The thinking there was that it has not escaped at that point and is not accessible by any
other thread.

bq. Is there any reason to wrap the List in a synchronizedList anymore?

Probably not.

bq. Would it be more straightforward to not have the two "layers" of synchronization?

Do you mean synchronizing on {{copies}} twice?  If so, that's using the same lock.
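
To illustrate the "same lock" point: Java monitors are re-entrant, and the wrapper returned by {{Collections.synchronizedList}} locks on itself, so an explicit {{synchronized}} block on the list and the list's own methods acquire one and the same monitor. A minimal sketch, where the class and method names are made up for illustration and are not taken from the Accumulo source:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Accumulo.
class ReentrantMonitorDemo {

    // Adds an element while already holding the list's monitor and returns the new size.
    static int addWhileHoldingLock(List<String> copies, String name) {
        synchronized (copies) {          // outer acquisition of the list's monitor
            copies.add(name);            // the synchronizedList wrapper locks on itself here
            synchronized (copies) {      // explicit nested acquisition; re-entrant, never blocks
                return copies.size();
            }
        }
    }

    public static void main(String[] args) {
        List<String> copies = Collections.synchronizedList(new ArrayList<>());
        System.out.println(addWhileHoldingLock(copies, "a"));
        System.out.println(addWhileHoldingLock(copies, "b"));
    }
}
```

The nested acquisitions cannot deadlock on their own, because the owning thread simply re-enters a monitor it already holds.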

bq. I think we could avoid the nested synchronization if we actually made a copy of copies
in deepCopy

copies is meant to keep track of all deep copies so that their sources can be switched.
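
A rough sketch of that idea, under the assumption that a single shared list acts as both the registry of deep copies and the only lock. The class {{SwitchingIter}}, the {{String}} source, and all method bodies are hypothetical simplifications, not the actual SourceSwitchingIterator code or the attached patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the Accumulo implementation.
class CopyTrackingSketch {

    /** Heavily simplified stand-in for an iterator whose source can be switched. */
    static class SwitchingIter {
        // Shared by the original iterator and every deep copy made from it.
        private final List<SwitchingIter> copies;
        private String source;

        SwitchingIter(String source) {
            this.source = source;
            this.copies = new ArrayList<>();
            // The new list has not escaped to another thread yet, so no lock is needed here.
            copies.add(this);
        }

        private SwitchingIter(String source, List<SwitchingIter> copies) {
            this.source = source;
            this.copies = copies;
        }

        SwitchingIter deepCopy() {
            synchronized (copies) {      // the only lock taken on this path
                SwitchingIter copy = new SwitchingIter(source, copies);
                copies.add(copy);
                return copy;
            }
        }

        void switchNow(String newSource) {
            synchronized (copies) {      // same single lock, so no ordering problem
                for (SwitchingIter it : copies) {
                    it.source = newSource;
                }
            }
        }

        String currentSource() {
            synchronized (copies) {
                return source;
            }
        }

        int trackedCopies() {
            synchronized (copies) {
                return copies.size();
            }
        }
    }

    public static void main(String[] args) {
        SwitchingIter root = new SwitchingIter("memory");
        SwitchingIter a = root.deepCopy();
        SwitchingIter b = a.deepCopy();
        b.switchNow("file");             // switching through any copy reaches all of them
        System.out.println(root.currentSource() + " " + a.currentSource() + " " + b.currentSource());
    }
}
```

Because every path takes only the one monitor on the shared list, there is no second lock whose acquisition order could be inverted.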

> deadlock in SourceSwitchingIterator
> -----------------------------------
>
>                 Key: ACCUMULO-3745
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3745
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2
>         Environment: Large production cluster, with complex iterator trees.
>            Reporter: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.3, 1.7.0, 1.6.3
>
>         Attachments: ACCUMULO-3745-1.patch
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Details come from an offline cluster, so it's difficult to reproduce the exact details.
> A very complex iterator was running over a tablet. "deepCopy" may have been called a couple
> dozen times, which may have contributed to the problem.
> Relevant facts:
> A scan and a minor compaction created a deadlock which was detected by the Java runtime.
> {noformat}
> "Query... ":
>   waiting to lock monitor 0x1234 (object 0x1234, a java.util.Collections$SynchronizedRandomAccessList),
>   which is held by "minor compactor 1"
> "minor compactor 1":
>   waiting to lock monitor 0x9876 (object 0x9876, a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator),
>   which is held by "Query..."
> {noformat}
> Java stacks:
> {noformat}
> "Query..."
>   at java.util.Collections$SynchronizedCollection.add(Collections.java:1636)
>   - waiting to lock <0x1234> (a java.util.Collections$SynchronizedRandomAccessList)
>   at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.<init>(SourceSwitchingIterator.java:72)
>   at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.deepCopy(SourceSwitchingIterator.java:85)
>   - locked <0x9876> (a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator)
>   ... PartialMutationSkippingIterator.deepCopy(InMemoryMap.java:113)
>   ... InMemoryMap$MemoryIterator.deepCopy(InMemoryMap.java:623)
>  ...
> {noformat}
> and:
> {noformat}
> "minor compactor 1":
>  at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator._switchNow(SourceSwitchingIterator.java:171)
>  - waiting to lock <0x9876> (a org.apache.accumulo.core.iterators.system.SourceSwitchingIterator)
>  at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.switchNow(SourceSwitchingIterator.java:184)
>  - locked <0x1234> (a java.util.Collections$SynchronizedRandomAccessList)
>  at org.apache.accumulo.tserver.InMemoryMap$MemoryIterator.switchNow(InMemoryMap.java:647)
>  ...
> {noformat}
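
The two stacks quoted above show a lock-order inversion: the scan holds the iterator monitor and wants the list monitor, while the minor compactor holds the list monitor and wants the iterator monitor. A minimal, self-contained sketch that reproduces that pattern with two plain lock objects and asks the JVM to report it. The class name, thread names, and lock objects are stand-ins for illustration only and do not come from the Accumulo code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; not part of Accumulo.
class LockOrderInversionDemo {

    // Returns the number of threads the JVM reports as monitor-deadlocked (0 if none within ~5s).
    static int countDeadlockedThreads() {
        final Object listLock = new Object();   // stands in for the synchronized list <0x1234>
        final Object iterLock = new Object();   // stands in for the iterator monitor <0x9876>
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread query = new Thread(() -> {
            synchronized (iterLock) {            // like deepCopy(): iterator monitor first
                arriveAndWait(bothHoldFirst);
                synchronized (listLock) { }      // then the list monitor
            }
        }, "query");
        Thread compactor = new Thread(() -> {
            synchronized (listLock) {            // like switchNow(): list monitor first
                arriveAndWait(bothHoldFirst);
                synchronized (iterLock) { }      // then the iterator monitor
            }
        }, "minor compactor");

        // Daemon threads so the deadlocked pair does not keep the JVM alive.
        query.setDaemon(true);
        compactor.setDaemon(true);
        query.start();
        compactor.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) {
                long[] ids = mx.findMonitorDeadlockedThreads();
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    // Signals that this thread holds its first lock, then waits until the other one does too.
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

The latch only forces the unlucky interleaving to happen every time; in a real tablet server the same inversion would presumably surface only when a scan and a minor compaction overlap at the wrong moment.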



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
