cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10688) Stack overflow from SSTableReader$InstanceTidier.runOnClose in Leak Detector
Date Thu, 07 Jan 2016 16:51:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087684#comment-15087684 ]

Benedict commented on CASSANDRA-10688:
--------------------------------------

It might eat up one CPU for a while, but it would eventually find any problems that are
there, and that's better than never finding them.  We _shouldn't_ have lots of gigantic
exponential object graphs reachable from any of the things we're exploring, though.

If we wanted to, we could:

- track how many explorations we had to perform for each node in our path, and save only those
that are above some threshold, thus guaranteeing a bound on the exponential component (or
store up to our maximum, then begin evicting those with the fewest explorations saved)
- have a separate leaky buffer that saves every object we visit (optionally a separate
one for each level in the path, updated on ascent); if we reach our limit we don't
stop, we just stop gaining any time savings
- make the visited set large, keep a blacklist of things that are never stored in the visited
set, such as KeyCache, and log strong warnings if we exceed our limit (since we shouldn't be able to)
- any mixture of the above
- just make the total visited set unbounded
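
To make the trade-off concrete, here is a minimal sketch of the "leaky buffer" idea, not the actual Ref.java code. It assumes a hypothetical Node class standing in for objects and their references, and hypothetical names (reaches, layeredGraph, maxVisited). Cycles are cut by tracking the current path, so the walk always terminates; the bounded set of finished nodes only saves time, and once it is full the walk continues without those savings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class BoundedVisitedSketch {
    /** Hypothetical stand-in for an object and its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /**
     * Iterative depth-first search for target starting at root.
     * explorations[0] is incremented once for every node we descend into.
     * The visited set holds at most maxVisited fully explored nodes.
     */
    static boolean reaches(Node root, Node target, int maxVisited, long[] explorations) {
        if (root == target) return true;
        Set<Node> onPath = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> path = new ArrayDeque<>();
        Deque<Iterator<Node>> iterators = new ArrayDeque<>();

        path.push(root);
        onPath.add(root);
        iterators.push(root.refs.iterator());
        explorations[0]++;

        while (!path.isEmpty()) {
            Iterator<Node> it = iterators.peek();
            if (!it.hasNext()) {
                // Ascent: this node is fully explored; remember it if there is room.
                Node done = path.pop();
                iterators.pop();
                onPath.remove(done);
                if (visited.size() < maxVisited) visited.add(done);
                continue;
            }
            Node next = it.next();
            if (next == target) return true;
            // Skip cycles (on the current path) and nodes already fully explored.
            if (onPath.contains(next) || visited.contains(next)) continue;
            path.push(next);
            onPath.add(next);
            iterators.push(next.refs.iterator());
            explorations[0]++;
        }
        return false;
    }

    /** Builds a root followed by `layers` layers of two nodes, each fully linked to the next layer. */
    static Node layeredGraph(int layers) {
        Node root = new Node("root");
        List<Node> previous = Arrays.asList(root);
        for (int i = 1; i <= layers; i++) {
            Node a = new Node("a" + i);
            Node b = new Node("b" + i);
            for (Node p : previous) {
                p.refs.add(a);
                p.refs.add(b);
            }
            previous = Arrays.asList(a, b);
        }
        return root;
    }

    public static void main(String[] args) {
        Node root = layeredGraph(10);
        Node missing = new Node("missing");
        long[] roomy = new long[1];
        long[] none = new long[1];
        reaches(root, missing, 1000, roomy);
        reaches(root, missing, 0, none);
        System.out.println("explorations with a roomy visited set: " + roomy[0]);
        System.out.println("explorations with no visited set:      " + none[0]);
    }
}
```

On the layered graph the number of explorations grows linearly with depth when the visited set has room, and exponentially when it has none, which is the "exponential component" the options above try to bound.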

If we are generally worried about running time, and we're retaining any generalised visited
set, we could also construct a set of all our roots and record any root we reach while exploring
another root; we can then terminate any given root exploration early by just checking whether
it is already associated with an object that has previously been visited.
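
One possible reading of the root-set idea, sketched under assumptions: Node, exploreAll and reachedRoots are hypothetical names, and the visited set is shared across roots and unbounded here for simplicity. Any root encountered while walking from another root is recorded, and its own exploration is skipped later because everything reachable from it is already covered by the shared visited set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class RootSkipSketch {
    /** Hypothetical stand-in for an object and its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Explores each root in order and returns the roots that actually needed a walk. */
    static List<Node> exploreAll(List<Node> roots) {
        Set<Node> rootSet = Collections.newSetFromMap(new IdentityHashMap<>());
        rootSet.addAll(roots);
        Set<Node> reachedRoots = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        List<Node> explored = new ArrayList<>();

        for (Node root : roots) {
            // Early termination: this root was already reached from an earlier root.
            if (reachedRoots.contains(root)) continue;
            explored.add(root);

            Deque<Node> stack = new ArrayDeque<>();
            stack.push(root);
            while (!stack.isEmpty()) {
                Node current = stack.pop();
                if (!visited.add(current)) continue; // already seen in some walk
                if (current != root && rootSet.contains(current)) reachedRoots.add(current);
                for (Node next : current.refs) stack.push(next);
            }
        }
        return explored;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node x = new Node("x");
        Node b = new Node("B");
        Node c = new Node("C");
        a.refs.add(x);
        x.refs.add(b);
        List<Node> roots = new ArrayList<>();
        roots.add(a);
        roots.add(b);
        roots.add(c);
        for (Node n : exploreAll(roots)) System.out.println("explored root " + n.name);
    }
}
```

This only shows the bookkeeping; whether skipping a root is safe depends on what each walk is checking for, which the sketch does not model.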

> Stack overflow from SSTableReader$InstanceTidier.runOnClose in Leak Detector
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10688
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10688
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths, Testing
>            Reporter: Jeremiah Jordan
>            Assignee: Ariel Weisberg
>             Fix For: 3.0.x
>
>
> Running some tests against cassandra-3.0 9fc957cf3097e54ccd72e51b2d0650dc3e83eae0.
> The tests are just running cassandra-stress write and read while adding and removing
> nodes from the cluster.  After the test runs, when I go back through the logs, I find the
> following stack overflow fairly often:
> ERROR [Strong-Reference-Leak-Detector:1] 2015-11-11 00:04:10,638  Ref.java:413 - Stackoverflow
[private java.lang.Runnable org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.runOnClose,
final java.lang.Runnable org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache.andThen,
final org.apache.cassandra.cache.InstrumentingCache org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys.cache,
private final org.apache.cassandra.cache.ICache org.apache.cassandra.cache.InstrumentingCache.map,
private final com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap org.apache.cassandra.cache.ConcurrentLinkedHashCache.map,
final com.googlecode.concurrentlinkedhashmap.LinkedDeque com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.evictionDeque,
com.googlecode.concurrentlinkedhashmap.Linked com.googlecode.concurrentlinkedhashmap.LinkedDeque.first,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node 
> ....... (repeated a whole bunch more) .... 
> com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.next,
final java.lang.Object com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node.key,
public final byte[] org.apache.cassandra.cache.KeyCacheKey.key



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
